How to Plot Timeseries Data in Python and Plotly

A simple tutorial on handling time series data in Python, from extracting dates and other time units to plotting them in charts.

Handling time series data can be a bit tricky. When I first had to deal with time-series data in Python and put them into charts, I was really frustrated. I probably spent a whole day just trying to figure out how to extract the dates or the months from a series of timestamp data, and at the end of the day, I still didn't understand a lot of things.

I am going to show you some code about this that I have learned recently. Hopefully, my future self or anyone looking for clues to do time series visualization will find this helpful.

I don't want to use dummy data for our examples here, so I am going to use real data instead. I collected all tweets from 1 January 2020 to 31 January 2021 (13 months of data) that contained the word “malioboro”.

Let's start by importing some important packages and the data themselves.

Since the data collected are in JSON format, I need to make Python read them line by line and convert them to pandas data frame format.
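As a rough sketch of this step, pandas can read line-delimited JSON directly with `read_json(..., lines=True)`. The field names, the sample values, and the file name mentioned in the comment are assumptions for illustration; a small in-memory sample stands in for the real file so the snippet runs on its own.

```python
import io

import pandas as pd

# Stand-in for the collected file: one JSON object per line.
# Field names and values are assumptions, not the real data.
sample_lines = io.StringIO(
    '{"id": 1, "created_at": "2020-01-01 08:15:00 SE Asia Standard Time", "tweet": "malioboro pagi ini"}\n'
    '{"id": 2, "created_at": "2020-01-01 21:40:12 SE Asia Standard Time", "tweet": "jalan ke malioboro"}\n'
    '{"id": 3, "created_at": "2020-02-14 10:05:33 SE Asia Standard Time", "tweet": "malioboro ramai"}\n'
)

# lines=True parses each line as a separate JSON record.
# convert_dates=False keeps the raw timestamp strings untouched for now.
# With a real file, pass its path instead, e.g. "malioboro.json" (hypothetical name).
df = pd.read_json(sample_lines, lines=True, convert_dates=False)

print(len(df))              # number of tweets loaded
print(df.columns.tolist())  # available fields
```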

Exactly 88,035 tweets were collected.

For a little over a year of data, that's pretty awesome, right? Malioboro is indeed a famous place; otherwise, Twitter wouldn't be talking about it.

The time data aren’t in a standard format yet. If you look at the raw values, you’ll see that each one ends with the timezone name. My laptop’s timezone is set to GMT+07 (Bangkok/Jakarta, South East Asia), and Twint probably followed that format.

Depending on the timezone set on your laptop, you’ll probably see something similar. We are going to remove these substrings and keep only the date-time, stored in a newly created column.
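One way to do this, as a minimal sketch: keep only the leading `YYYY-MM-DD HH:MM:SS` part of each string and ignore whatever comes after it, so the code doesn't depend on the exact timezone name. The column names `created_at` and `datetime`, and the sample values, are assumptions.

```python
import pandas as pd

# Stand-in rows; the column name and the timezone suffix are assumptions.
df = pd.DataFrame({
    "created_at": [
        "2020-01-01 08:15:00 SE Asia Standard Time",
        "2020-01-01 21:40:12 SE Asia Standard Time",
        "2020-02-14 10:05:33 SE Asia Standard Time",
    ]
})

# Capture only the date-time at the start of the string and drop the rest.
stamp = df["created_at"].str.extract(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})", expand=False
)

# Convert the cleaned strings into real datetime values in a new column.
df["datetime"] = pd.to_datetime(stamp, format="%Y-%m-%d %H:%M:%S")

print(df["datetime"])
```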

Now we have the date-time data stored in the new column. However, since we only need the dates and the months, we are going to extract them using the following code.
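A sketch of that extraction, assuming the date-time values live in a column called `datetime` (an assumed name). Here the date is kept as a midnight timestamp and the month as the first day of that month, which keeps both columns sortable and easy to plot on a time axis.

```python
import pandas as pd

# Stand-in date-time values; the column name is an assumption.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2020-01-01 08:15:00",
        "2020-01-01 21:40:12",
        "2020-02-14 10:05:33",
    ])
})

# Date only: reset the time part to midnight.
df["date"] = df["datetime"].dt.normalize()

# Month only: convert to a monthly period, then back to a timestamp
# (the first day of the month).
df["month"] = df["datetime"].dt.to_period("M").dt.to_timestamp()

print(df)
```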

To plot the data, we need the number of tweets for each time unit (month or day).

First, we are going to plot the data by month. We do that by making a new data frame consisting of each month and its number of tweets.
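Counting rows per month is a plain group-by. This sketch assumes a `month` column with one row per tweet; the column names and the sample rows are made up for illustration.

```python
import pandas as pd

# One row per tweet, with the month already extracted (stand-in values).
df = pd.DataFrame({
    "month": pd.to_datetime(["2020-01-01", "2020-01-01", "2020-02-01"])
})

# size() counts the rows in each group; reset_index turns the result
# into a two-column data frame: month and number of tweets.
monthly = df.groupby("month").size().reset_index(name="tweets")

print(monthly)
```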

These data are now ready to be plotted.

You can also make a more detailed chart that shows the trend day by day, but I don't suggest this for data covering a longer period (say, three years or more) because the lines would become really confusing and harder to read.

We do this by creating a data frame that stores the number of tweets day by day.
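The daily version is the same group-by, just keyed on the date instead of the month. Again, the column names and rows here are assumptions for illustration.

```python
import pandas as pd

# One row per tweet, with the date already extracted (stand-in values).
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-01-01", "2020-01-01", "2020-01-02",
        "2020-01-04", "2020-01-04", "2020-01-04",
    ])
})

# Count tweets per day; days without any tweet simply do not appear.
daily = df.groupby("date").size().reset_index(name="tweets")

print(daily)
```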

This chart looks very interesting because now you can see the peaks on certain dates. There must be something important occurring during those days that caused Twitter to talk about Malioboro more than usual.

I am going to highlight these important dates by putting some dots to show their significance in the data.

Here I am going to get the top three peak dates and store them in a data frame.
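As a sketch, `DataFrame.nlargest` picks the rows with the highest counts. The daily counts below are made up, and the column names are assumptions.

```python
import pandas as pd

# Made-up daily counts, only to make the sketch runnable.
daily = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-03-01", "2020-03-02", "2020-03-03", "2020-03-04", "2020-03-05",
    ]),
    "tweets": [120, 480, 150, 610, 90],
})

# The three days with the most tweets, highest first.
peaks = daily.nlargest(3, "tweets")

print(peaks)
```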

Then I use this data frame to put notes on the chart. It’s pretty similar to the previous chart we have made, but it has some points that highlight these dates.

You can change the text shown on the points to anything you want. For example, you can use it to write down notes about the events that caused the peaks.

These charts are interesting, but without a story to accompany them, the readers wouldn't get many clues.

These charts and context have hopefully created some kind of story about our data.

In short, plotting time series data using Plotly is actually pretty simple and straightforward. If you still find some things confusing, that's okay. You don't have to get everything on the first try, because sometimes it takes a while to get used to.

Hopefully you found this tutorial easy to follow. If not, hit me up in the comments!

You can look at the whole code I used in this article in my GitHub repository. Thanks for reading!
