Here are a few ethical considerations and limitations of our project, and how we can do better in the future.
The first is bias. Because we only looked at airlines, our analysis is biased toward air travel.
We didn't take other forms of transportation into consideration, mainly because data for them is hard to obtain.
Still, our project would be stronger if we could compare airline data with train records or data from other forms of transportation.
Another bias in our project is that we only considered the US.
If we could get flight data and COVID-19 case datasets from around the world, we could compare them to find out to what degree COVID has affected each country.
Another ethical consideration concerns our dataset of tweets.
Even though the data was scraped from publicly available tweets, each tweet is tied to a specific Twitter user, and it remains in our dataset even if that user deletes it.
As for sentiment analysis, we did draw a few conclusions from it, but the sentiments we extracted from the tweets are not accurate enough.
They are often too ambiguous for us to work out what users actually thought, so we cannot rely on this analysis yet. Our next step would be to improve the accuracy of our sentiment analysis and make its results more convincing.
The text data was only analyzed on its own. We didn't combine it with other datasets during our data analysis. We should definitely try to vectorize the text data, combine it with other forms of data, and analyze them together.
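As a rough illustration of what vectorizing and combining could look like, here is a minimal sketch that turns tweets into per-day word counts and joins them with numeric data on the date. It is not the method we used in the project. The function names, the vocabulary, the tweets, and the numbers are all made up for the example, and a real pipeline would more likely use a library vectorizer such as TF-IDF.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase a tweet and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def daily_term_counts(tweets, vocabulary):
    """Map each date to a bag-of-words count vector over `vocabulary`."""
    bags = {}
    for date, text in tweets:
        bag = bags.setdefault(date, Counter())
        bag.update(tok for tok in tokenize(text) if tok in vocabulary)
    return {date: [bag[term] for term in vocabulary] for date, bag in bags.items()}


def combine(text_vectors, numeric_rows):
    """Join text vectors with numeric rows by date, keeping only shared dates."""
    shared_dates = sorted(text_vectors.keys() & numeric_rows.keys())
    return {d: text_vectors[d] + list(numeric_rows[d]) for d in shared_dates}


# Toy inputs, invented purely for illustration.
vocabulary = ["flight", "cancelled", "safe"]
tweets = [
    ("2020-03-01", "My flight got cancelled again"),
    ("2020-03-01", "Is it safe to take a flight?"),
    ("2020-03-02", "Flight cancelled, flight refunded"),
]
# date -> (number of flights, new cases); hypothetical values
numeric_rows = {"2020-03-01": (100, 5), "2020-03-02": (80, 9)}

features = combine(daily_term_counts(tweets, vocabulary), numeric_rows)
for date, row in features.items():
    print(date, row)
```

Each row then holds the word counts followed by the numeric columns for the same day, so text and non-text signals can be fed into one analysis.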
We did draw the network graph for the OpenSky dataset, which should have yielded a lot of useful information about the correlations between the pandemic and people's attitudes toward it.
However, since we had few datasets to compare it with, we didn't get any useful information, which should not have been the case. So network analysis should also be a big part of our future work.
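One possible starting point for that future network analysis is sketched below: build an airport graph from origin and destination pairs, then compare how many distinct routes each airport has in two periods. This is only an assumption about how the comparison might be set up. The route lists are invented, and the real OpenSky records would first have to be reduced to such pairs.

```python
from collections import defaultdict


def build_route_graph(flights):
    """Build an undirected graph: airport -> {neighbor: number of flights}."""
    graph = defaultdict(lambda: defaultdict(int))
    for origin, destination in flights:
        if origin == destination:
            continue  # ignore flights that return to the same airport
        graph[origin][destination] += 1
        graph[destination][origin] += 1
    return graph


def degree(graph):
    """Number of distinct airports each airport is connected to."""
    return {airport: len(neighbors) for airport, neighbors in graph.items()}


def degree_change(before, after):
    """Change in degree per airport; airports missing afterwards count as 0."""
    deg_before, deg_after = degree(before), degree(after)
    return {a: deg_after.get(a, 0) - deg_before[a] for a in sorted(deg_before)}


# Invented route lists for two periods, for illustration only.
flights_before = [("JFK", "LAX"), ("JFK", "ORD"), ("LAX", "ORD"), ("JFK", "LAX")]
flights_after = [("JFK", "LAX")]

change = degree_change(build_route_graph(flights_before),
                       build_route_graph(flights_after))
print(change)
```

A per-airport change like this could then be lined up with case counts or tweet sentiment for the same period, which is the kind of comparison we were missing.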