  • Ethical considerations
  • Limitations
  • Future Work

This section covers a few ethical considerations and limitations of our project, and how we can do better in the future.

6.1 Bias

6.1.1 Only airline data included

The first issue is bias. Because we only analyzed airline data, our conclusions are biased toward air travel.

We did not take other forms of transportation into account, mainly because their data is hard to obtain.

Still, the project would be stronger if we could compare the airline data with train records or data from other modes of transport.

6.1.2 Only US regions are considered

Another source of bias in our project is that we only considered regions within the US.

If we could obtain flight data and COVID-19 case datasets from around the world, we could compare countries to find out to what degree COVID-19 has affected each of them.

6.2 Tweets

Another ethical consideration concerns our dataset of tweets.

Even though the data was scraped from publicly available sources, each tweet is still tied to a specific Twitter user.

6.2.2 Data remains

Tweet data remains in our dataset even if the user later deletes the tweet.
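One possible mitigation is to periodically drop stored tweets that are no longer publicly available. The following is only a minimal sketch under assumptions: the record layout, the function name `purge_deleted`, and the list of still-available IDs are all hypothetical, and how that list would be obtained is outside the scope of this sketch.

```python
def purge_deleted(tweets, still_available_ids):
    """Keep only the tweets whose id is still publicly available."""
    available = set(still_available_ids)
    return [t for t in tweets if t["id"] in available]


# Hypothetical stored records (not taken from our real dataset).
stored = [
    {"id": 101, "user": "a", "text": "Flight cancelled again"},
    {"id": 102, "user": "b", "text": "Finally flying home"},
    {"id": 103, "user": "c", "text": "Airport is empty"},
]

# Hypothetical result of re-checking which tweets still exist.
available_now = [101, 103]

cleaned = purge_deleted(stored, available_now)
```

With this sample input, tweet 102 is removed from `cleaned` because its ID is missing from `available_now`.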

6.3 Sentiment Analysis

6.3.1 Not accurate enough

We did draw a few conclusions from the sentiment analysis, but the sentiments we extracted from the tweets are not accurate enough.

The sentiment in many tweets is too ambiguous for us to determine what their authors actually think, so we cannot rely on these results. Our future work will therefore focus on improving the accuracy of the sentiment analysis to make its conclusions more convincing.
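Improving accuracy first requires measuring it. One way to do that, sketched here under assumptions, is to hand-label a small sample of tweets and compare those labels with the model output. The label names (`pos`, `neg`, `neu`), the sample values, and the helper names are hypothetical and do not come from our actual results.

```python
from collections import Counter


def accuracy(predicted, labeled):
    """Fraction of predictions that match the hand-assigned labels."""
    if len(predicted) != len(labeled):
        raise ValueError("predicted and labeled must have the same length")
    if not labeled:
        raise ValueError("at least one labeled example is required")
    correct = sum(p == y for p, y in zip(predicted, labeled))
    return correct / len(labeled)


def confusion(predicted, labeled):
    """Count (true label, predicted label) pairs to show where errors occur."""
    return Counter(zip(labeled, predicted))


# Hypothetical hand labels and model output for five tweets.
hand_labels = ["neg", "neg", "pos", "neu", "pos"]
model_output = ["neg", "pos", "pos", "neu", "neg"]

acc = accuracy(model_output, hand_labels)
errors = confusion(model_output, hand_labels)
```

The confusion counts show which sentiment classes are mistaken for each other, which indicates where the analysis needs the most work.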

6.4 Data Combinations

6.4.1 Text data should be incorporated into the analysis

The text data was analyzed only in isolation: we did not combine it with the other datasets during our analysis. In future work we should vectorize the text data, combine it with the other forms of data, and analyze them together.
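As a rough illustration of what such a combination could look like, the sketch below turns text into bag-of-words count vectors and appends numeric columns to each vector. It uses only the Python standard library; the field names (`tweets`, `flights`, `new_cases`) and all values are hypothetical, and a real pipeline would likely use a weighting scheme such as TF-IDF and scale the numeric columns.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(texts):
    """Map every distinct word to a fixed column index (alphabetical order)."""
    words = sorted({w for t in texts for w in tokenize(t)})
    return {w: i for i, w in enumerate(words)}


def vectorize(text, vocab):
    """Return a bag-of-words count vector for one text."""
    vec = [0] * len(vocab)
    for word, count in Counter(tokenize(text)).items():
        if word in vocab:
            vec[vocab[word]] = count
    return vec


# Hypothetical daily records: tweet text plus numeric data for the same day.
days = [
    {"tweets": "flight cancelled flight delayed", "flights": 1200, "new_cases": 30000},
    {"tweets": "travel is back", "flights": 2500, "new_cases": 8000},
]

vocab = build_vocabulary([d["tweets"] for d in days])

# One combined feature row per day: word counts followed by numeric columns.
rows = [vectorize(d["tweets"], vocab) + [d["flights"], d["new_cases"]] for d in days]
```

Each row in `rows` can then be passed to whatever joint analysis is chosen, since text and numeric data now share one representation.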

6.4.2 Network graph isn’t combined with text data

We did draw a network graph for the OpenSky dataset, which should yield useful information about the correlation between the pandemic and people's attitudes toward it.

However, because we had few datasets to compare it with, we did not obtain any useful information from it, which should not have been the case. Network analysis should therefore also be a major part of our future work.
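One direction for that future work is to derive simple per-airport metrics from the network so that they can be compared with other datasets. The sketch below is a minimal illustration under assumptions: flights are reduced to hypothetical (origin, destination) pairs, the airport codes are sample values, and the code does not reflect the actual OpenSky schema.

```python
from collections import defaultdict


def build_route_graph(flights):
    """Build an undirected graph; edge weights count flights on each route."""
    graph = defaultdict(lambda: defaultdict(int))
    for origin, dest in flights:
        graph[origin][dest] += 1
        graph[dest][origin] += 1
    return graph


def degree(graph):
    """Number of distinct airports each airport is connected to."""
    return {airport: len(neighbors) for airport, neighbors in graph.items()}


# Hypothetical flights for two periods (sample values only).
before = [("JFK", "LAX"), ("JFK", "ORD"), ("ORD", "LAX"), ("JFK", "LAX")]
during = [("JFK", "LAX")]

deg_before = degree(build_route_graph(before))
deg_during = degree(build_route_graph(during))

# Connections each airport lost between the two periods.
lost = {a: deg_before[a] - deg_during.get(a, 0) for a in deg_before}
```

A per-airport figure such as `lost` could then be lined up with case counts or tweet sentiment for the same region and period.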