Reporting on Covid-19 in Arkansas: Data Journalism
Jour 5283, Fall 2020
Remote Delivery, Monday-Wednesday 9:40 a.m.-10:55 a.m.
Rob Wells, Ph.D.
rswells@uark.edu
@rwells1961
Agenda:
--Discuss syllabus
--Blackboard site
--Teams
--Intellectual Property / Data Sharing Releases
--Arkansascovid.com
--Intro R and R Studio. Open program.
--Read Machlis, Ch. 2
--Advanced learners, see below
Teams
For the first class, please have Teams installed so we can do some exercises.
Teams is free through your university Office365 account. Download the Teams App through the Office365 suite.
https://its.uark.edu/communication-collaboration/office365/office365-desktop-apps.php
Teams Videos
Microsoft Teams allows us to easily share information through the class or in discrete groups.
Chat in Teams
Create a post
How to tag a person in Teams
R and R Studio
Install R and R Studio.
Both are free, open source software. They are not large and don't use much memory.
R runs on Windows, Mac and Linux, but this course is designed for the Mac version.
If you use Windows, there may be variations in the lessons and instructions. Please see me for questions.
Installing R is a two-step process:
1) Install R, the actual program
2) Install RStudio, a common interface
1) Download the most recent version of R for Mac:
https://mirrors.nics.utk.edu/cran/bin/macosx/R-4.0.2.pkg
--If you have a Windows computer, go to:
https://mirrors.nics.utk.edu/cran/bin/windows/base/R-4.0.2-win.exe
Accept all of the default settings for Mac.
2) Install RStudio, the interface we use to manage and create R code. Download the open source edition of R Studio desktop and follow the prompts to install it.
https://rstudio.com/products/rstudio/download/#download
Good instructions for installing R
http://www.machlis.com/R4Journalists/download-r-and-rstudio.html
Good overview of the program
https://docs.google.com/presentation/d/1O0eFLypJLP-PAC63Ghq2QURAnhFo6Dxc7nGt4y_l90s/edit#slide=id.p
Intellectual Property / Data Sharing Releases
UofA Rules of Conduct
https://docs.google.com/document/d/1YkdkRIzIs1WQ3P9KIICvHcppfWvwTyo2bRhGwQPsgVE/edit
License Agreement
https://docs.google.com/document/d/1AahzxDOzTf9Z6PBjBvFBOjnn9_BiM4YldnXW-mHZr9s/edit
Machlis, Sharon. Practical R for Mass Communications and Journalism. Chapman & Hall/CRC The R Series. 2018. ISBN 9781138726918 https://www.amazon.com/gp/search?keywords=9781138726918
Ch. 1, Introduction; Ch. 3, See How Much You Can Do in a Few Lines of Code
Chapter 2: Get Started With R in a Few Easy Steps
Here is the link to the first six chapters for free
http://www.machlis.com/PracticalR4Journalism/index.html (first six chapters are free)
Basic tutorial on R:
https://profrobwells.github.io/Guest_Lectures/Intro_To_R/R1_Intro-to-R.html
CORRECT LINK: Download the code and try it yourself
(Left click, download, remove the .txt extension)
https://github.com/profrobwells/CovidFall2020/raw/master/Exercises/R1_Intro%20to%20R.Rmd
Advanced Corner
--Import Arkansascovid.com data and summarize Washington County trends
https://docs.google.com/spreadsheets/d/17M92KbKw1nIOD_co11hN_B0o4AmAW2zs1Nb3QGeDl-I/edit?usp=sharing
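For the Advanced Corner, here is a minimal sketch of the import-and-summarize step in base R. It assumes hypothetical column names (county, date, new_cases) and a few made-up rows standing in for the sheet. The real sheet's columns may differ, so treat this as a starting point rather than the class method.

```r
# Hypothetical rows standing in for the Arkansascovid.com sheet.
# Column names and values are made up for illustration.
csv_text <- "county,date,new_cases
Washington,2020-08-17,40
Washington,2020-08-18,55
Benton,2020-08-17,30
Washington,2020-08-19,35
Benton,2020-08-18,25"

covid <- read.csv(text = csv_text, stringsAsFactors = FALSE)
covid$date <- as.Date(covid$date)

# Keep only Washington County and sort by date
washington <- covid[covid$county == "Washington", ]
washington <- washington[order(washington$date), ]

# Summarize the trend: total, daily average, and day-to-day change
total_cases <- sum(washington$new_cases)
avg_cases <- mean(washington$new_cases)
washington$change <- c(NA, diff(washington$new_cases))

print(washington)
```

To work from a file you downloaded as CSV, swap the `text = csv_text` argument for the file path.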
Agenda
Arkansascovid.com
Intro to Tableau
Teams
IRE Conference: The IRE20 Conference will be Sept. 21-25
Exercises and tutorial
Same Goal, New Direction
Our partnership with Arkansascovid.com has changed.
Misty Orpin has asked the Journalism Department to run the website. https://arkansascovid.com/
I said yes.
It's an incredible opportunity for the class and the School of Journalism.
And here is how we will do it.
--The School of Journalism will have a GA and a Graduate Intern dedicated to basic daily update of the site: home page data, key charts, a few key Tweets per day.
--This class will write stories and produce data visualizations based on Arkansascovid.com data. That was our original goal. We will just be doing more and with different tools, primarily Tableau and then R, and in a different way.
And we will be Tweeting and doing podcasts, so that will be new.
Normally, I do not revise the course plan like this after the first day of class. But what an opportunity for all of us!
You and the School of Journalism and Strategic Media will be running a multimedia data site and Twitter feed with 11,000 followers reporting on one of the most important issues facing society.
What is New
This course will emphasize Tableau for the next two months instead of R
We will shift to a more limited examination of R in October
Tableau is less complex than R
Homework assignments will involve Tweeting on the account
IRE Conference: Sept. 21-25
Fellowship:
Tableau Introduction
Please download this tutorial
https://wordpressua.uark.edu/datareporting/tableau/
Nerd Corner
Those of you who don't need a Tableau introduction, please do this
1) Create a new Tableau project and link to this Google Sheet
https://docs.google.com/spreadsheets/d/1g-gkjJOr1sKAu6rZHG04XA5_fM_Ma0jLr5r24fwMqiA/edit?usp=sharing
You may need to copy the sheet to your own Google Drive to make it work. Let me know what you find out.
2) Use the Austin Wilkins automated feed of ADH data to replicate any of the arkansascovid.com charts.
Wong, Dona M. The Wall Street Journal Guide to Information Graphics. W. W. Norton & Company. 2013. ISBN 0393347281. https://www.amazon.com/Street-Journal-Guide-Information-Graphics/dp/0393347281 Ch. 1: The Basics
Arkansascovid.com
Gov. Hutchinson’s YouTube Channel. Check the Aug. 24 event: https://www.youtube.com/channel/UCLJcNdgp2PMEmiqJEoYzqwQ
Importing Census Data examples https://video.uark.edu/media/Census1+-+Importing+Data/1_xykm3ovj
This exercise will test out your Twitter skills. It is not a live Tweeting exercise. That will come next week!
Tweeting: Examine the Arkansascovid.com data for Aug. 24.
https://docs.google.com/spreadsheets/d/1Yq2MdmMfWijmZzJwA2VulWJzOCWSiOa--SasEAGo7XY/edit#gid=1357388768
Create a Google Doc.
Produce four draft tweets: one on new cases, one on deaths and two on other items from Gov. Hutchinson's Aug. 24 press conference, found here:
https://www.youtube.com/channel/UCLJcNdgp2PMEmiqJEoYzqwQ
Provide a Tableau image for each Tweet.
Post the Google Doc link on the Blackboard assignment.
Agenda
Update on Daily Work Process
Who Are We?
Review Tweet Assignment. Tableau
Story Pitches
Arkansascovid.com workflow
R Basics
Prepare Questions for Misty Orpin
What is Wrong With This Picture
_________________________________________________________________________________________
Random: My nephew, Nick Wells, on a 40-mile trail run Saturday on Mt Hood in Oregon
_________________________________________________________________________________________
Here’s an update
I made a lot of progress over the weekend figuring out the automation process to gather and crunch the daily numbers. I will discuss this in class but I want you all to check this short video before class tomorrow.
https://video.uark.edu/media/ADH+Data+Flow+to+Arkcovid/0_c6hn4feo
After looking at everything, I turned to R to streamline the data gathering and analysis process. So I will be teaching you guys basic R in tandem with Tableau. They go hand in hand.
For this class, the grading will be based on a combination of daily task work to keep the site running and a few assignments that highlight the data skills I want you to learn this semester.
Daily Tasks
Some combination of these each week:
See link to Daily Tasks in Teams
Tweeting duties:
Prepare and monitor twitter for research requests, feedback, engagement.
Draft responses. Post in Teams
Data Visualizations
Data Maintenance for Site
Web design, posting
Shorter stories
Improvements
UATV or other media partnership / appearance
Work with another class
Tasks TBD
Each week you’ll put a report of your activities in the “Daily Tasks” Assignment in Blackboard. Instructions are on that assignment.
The grading will be guided by this concept: Make meaningful contributions to the website each week. The term “meaningful contributions” is deliberately vague since we are in an evolving process. I don’t want to set a floor for minimal behavior - I am going to push you pretty hard and want you to multitask effectively. I would view the grading like this: Someone will get less than an A if they don’t communicate or flake out on an assignment. If you demonstrate to me that you are trying, we don’t have a problem.
Assignments
There are still four in the semester but there will be more flexibility on when they are handed in and what the content will involve. The basic skills I want everyone to learn remain the same:
—Carefully reported enterprise stories
—Data Visualization in R, Tableau (or Flourish)
-R Data Analysis with a basic chart
—Major Data Maintenance, Restructuring for Site
The R and Tableau instructional material will be delivered like before. I will count on you guys to work through the exercises and gain basic competency.
Teams Has Two Standing Documents Posted for Daily Tasks and Story Pitches, which I will discuss below
Who Are We?
I have done a few media interviews since the announcement Thursday. First, the announcement was met by a great deal of enthusiasm in the community, among senior faculty and the like.
A few questions came up.
–Will we have the opinions that Misty offered in her Twitter feed? My response is that Misty came to her opinions after some very in-depth reporting. My class is oriented towards the Associated Press style of reporting: hard news and analysis. Our posts will be rooted in in-depth reporting that is transparent to the readers.
–How do we respond to people who ask us Covid-19 questions? If you have the answer and can supply the source, then reply, noting you are speaking for yourself. If you are in doubt, say you don’t know and that you’ll raise the issue for the class.
This should serve as a reminder that everyone in this class will be in a social media fishbowl. People may troll us and look in our social media feeds to find things to embarrass the project. Keeping your personal social media feeds professional will be important. This is one thing all journalists need to acknowledge.
Tasks To Do
See document in Teams. Suggest a task here: put a comment on it and tag me. If I agree, I’ll add it.
Story Pitches
See document in Teams. If you have a story idea, even if it is not fully developed, put it here and the class can examine and discuss it.
_________________________________________________________________________________________
Exercise
Arkansascovid.com Tableau Lesson #2 –Notes are in Teams
https://wordpressua.uark.edu/datareporting/tableau/
Basic R exercise
–Left click on the link below, remove .txt extension, save as all files
https://github.com/profrobwells/CovidFall2020/blob/master/Exercises/Intro%20to%20R%208-20-2020.Rmd
Second option: If that isn’t successful, then go to the raw file:
https://raw.githubusercontent.com/profrobwells/CovidFall2020/master/Exercises/Intro%20to%20R%208-20-2020.Rmd
–Left click on the document
–Save as ALL FILES. Remove the .TXT extension
Third option: Download "Intro to R 8_20_2020.rmd" directly from your browser.
https://github.com/profrobwells/CovidFall2020/tree/master/Exercises
Ch 1 & 2 of Machlis: Key Points
Reproducible research
Repetitive tasks in modern newsrooms. Employment reports, crime stats, budgets
Variables - an R object
Assignment operator <-
Case sensitive
Vector: A vector can only have one type of data - all integers, all strings
Dataframe - like a spreadsheet
Save files - Don’t save the workspace. If you do, all of your variables will be stored and re-loaded the next time you launch RStudio, and it’s too easy to forget about previously stored variables that can interfere with later work.
Software packages: tidyverse, rio, pacman
Data Types and R
Machlis: 2.4.2 Data types you’re likely to use often
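The key points above can be tried out in a few lines of base R. This is a minimal sketch with made-up county names and numbers, not an excerpt from the Machlis book.

```r
# Assignment operator: store a value in a variable (an R object)
new_cases <- 549

# R is case sensitive: new_cases and New_Cases are different names
New_Cases <- 600
new_cases == New_Cases   # FALSE

# A vector holds one type of data
counties <- c("Washington", "Benton", "Pulaski")   # all strings (character)
cases <- c(40L, 30L, 55L)                          # all integers

# Mixing types forces everything to one type; here the number becomes a string
mixed <- c("Washington", 40)
class(mixed)   # "character"

# A data frame is like a spreadsheet: each column is a vector
covid <- data.frame(county = counties, cases = cases)
str(covid)
```

Run it line by line in RStudio and watch the Environment pane fill in as each object is created.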
_________________________________________________________________________________________
Three specific questions for Misty Orpin, our guest speaker Wednesday. Ask about the site’s workflow, her tone and voice on Twitter, her interactions with the public on social media, her concerns about the state data, or something else of significance. Post your questions in Teams
What Is Wrong With This Picture?
Examine this graphic and the data.
Consider the news reporting on Covid-19 trends in Arkansas and issues with data. Tell me what you think are the issues with this graphic. Post your comment on Teams. One paragraph or so.
Cohen, "Numbers in the Newsroom," Common Mistakes.
Machlis. Ch 4: Import Data into R
Basics of Data Analysis (On Blackboard)
_________________________________________________________________________________________
Agenda
Guest speaker: Misty Orpin, Arkansascovid.com
For Kendal, Obed, Quincy: Tableau Skills Part 2, Sept. 2, 2020. Work on this exercise on calculated fields. Make the assigned graphics and insert them into a Google Doc. Put that in your weekly memo due Saturday.
https://wordpressua.uark.edu/datareporting/tableau/
For Mary and Abby:
Look in the Machlis book about how to navigate R Markdown files.
Load this tutorial in your R Studio and run it.
Contact me with questions.
Introduction to R
--Left Click on document
--Save as ALL FILES. REMOVE .TXT extension
For Katy:
Continue testing Arkansascovid scripts
Work with Wells on GitHub account
Machlis, Ch. 6: Beginning data visualization
Video: Basic Charts in R
https://www.youtube.com/watch?v=1EUJ0tsVsUA&t=12s
Refresher: Numbers in the Newsroom (On Blackboard)
Happy Labor Day! No Class
Agenda
Weekly memos
Story pitches. Due Saturday with weekly memo
Review Task List on Teams. Tweets
Editing the Tweets. Have the Image Match the Narrative
Volunteer editor W-T-F (pardon the pun)
Audience Engagement. Read the Feed! Twitter Response Hour.
Continue with R tutorial
Exercise: Loading Data from U.S. Census & Student Loans
top_n function makes life easy
Sorting
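Here is a minimal sketch of sorting and top_n() with dplyr. The county totals are made up for illustration and are not the Census or student loan data used in the exercise.

```r
library(dplyr)

# Hypothetical county totals, made up for illustration
county_cases <- tibble(
  county = c("Washington", "Benton", "Pulaski", "Sebastian", "Craighead"),
  cases  = c(120, 95, 210, 60, 75)
)

# Sorting: arrange() orders rows; desc() puts the largest first
sorted <- county_cases %>%
  arrange(desc(cases))

# top_n() keeps the rows with the n largest values of a column.
# It does not sort the result, so arrange() is added afterward.
top3 <- county_cases %>%
  top_n(3, cases) %>%
  arrange(desc(cases))

print(top3)
```

Newer versions of dplyr recommend slice_max() for the same job, but top_n() still works.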
Weekly memos
Please supply links to documents and data you worked on.
The memos will be my resource for finding stuff. Help me find things
Story Pitch for Assignment #1. Due With Weekly Memo, Saturday
Story Pitch for this assignment. 200 Words on your idea, list 3 sources and the data you plan to use.
Assignment #1: (Due Sept. 23): Managing Data / Static Graphic
Static Graphic - Managing Data in R / Tableau.
Students will use R Studio to gather, analyze and visualize Arkansascovid data and report and write a 600 word story.
Exercise
Downloading Data 9-9-2020.Rmd
Check what software packages are running: Global Environment
Navigation:
^ + shift + 8 = Zoom to Environment
AP Style on numbers: AP Numerals Entry.pdf
Notes
--The pie chart focuses the reader on large percentages, and encourages the reader to think of the total
--The stacked bar plot provides the same information, but makes it easier to accurately determine at a glance how large each group is out of the whole.
--This bar chart splits the categories horizontally, and draws attention to how the family members are ordered. It encourages the reader to think about the distribution rather than disconnected categories, and gives a better sense of scale.
Wong, Dona M. The Wall Street Journal Guide to Information Graphics.
Ch. 2: Chart Smart
Grammar of Graphics
http://vita.had.co.nz/papers/layered-grammar.html
Agenda
Teams Housecleaning: Master Work Document
Twitter: Voice, engagement, training wheels: sign up
Twitter schedule this week
Review Task List - Sign up for Tweets
2 pm call Gavin Lesnick and Data Format Changes
Data - R quick viz on publication
GitHub data
-GitHub Site for Arkansascovid
https://github.com/Arkansascovid/Main
-2 pm call Gavin Lesnick and Data Format Changes
https://docs.google.com/document/d/1pLIqBGnEiYLVZg48qVDrm3Cajp4jiTwgxrWCKStaJwg/edit?usp=sharing
Maps in Tableau
New County Data
https://prod-useast-a.online.tableau.com/t/datareportinguofa/views/PCR-AntigenMAPS9-14-2020/PCR-Antigen?:origin=card_share_link&:embed=n
New county data https://raw.githubusercontent.com/profrobwells/CovidFall2020/master/new_county_data_9_13_2020.csv
Also here: https://docs.google.com/spreadsheets/d/1jILA2AQaSt36RsBPJLoAGAY_bpKd2UK0wWsXlXkS2cQ/edit#gid=1515682069
Basic maps for antigen and PCR testing today.
Create a new dashboard using the Homepage Numbers as a template
-Discuss
Wong, Dona M. The Wall Street Journal Guide to Information Graphics. Ch. 2: Chart Smart
Baselines p 52
For Bar Charts, Zero Baseline! p 65
Two-Thirds of Graph to Display Line
Increments - not in 3s or odd numbers. p 53
No More Than Four Lines: Simplicity
Panel of Chart. p 55
Labels, avoid Legends. p 57
Two Variables. Use % Change for 2nd Series. p 59
Comparable Scales For Two Charts. p 61
-Read our reader feedback on the new site. Draft a response to this person. We’ll discuss and then send one
See “Stacy Robinson Your Post on Twitter” in Files | Teams
https://bit.ly/32tw7Q1
-IRE Conference Schedule
Identify two events you plan to attend and tell the class why
Wong, Dona M. The Wall Street Journal Guide to Information Graphics.
Ch. 3 & 4: Ready Reference and Tricky Situations
Prof. Alberto Cairo on Data Visualization
https://video.uark.edu/media/Alberto+Cairo+Data+Viz/0_10pblm1b
Agenda
Assignment #1 Pushed to Sept. 23
Second Eyes on Tweets - Editing Help
Access to Google Drive and WordPress
Workshop
Assignment #1 Progress
Assignment #1: (Due Sept. 23): Managing Data / Static Graphic
Static Graphic - Managing Data in R / Tableau.
Students will use R Studio to gather, analyze and visualize Arkansascovid data and report and write a 600 word story.
Workshop
Time to Divide and Conquer The Task List
Latest Details in the Task List
Katy: join the new data with the old and document it
Kendal: fix the mobile WordPress links
Obed: pull together the Spanish workbooks for the state update. Then the counties
Abby: prepare text snippets to run out at 2 pm from the tweet slide. Look at new data releases and come up with additional snippets
Mary: fix the demographic data and join it to the new version. Build Tableau templates
Wong, Dona M. The Wall Street Journal Guide to Information Graphics.
Ch. 5: Charting Your Course
Alberto Cairo, "The Functional Art," Principles of Data Visualization.
1) IRE Conference Schedule, Identify two events you plan to attend and tell the class why.
https://www.ire.org/events-and-training/event/4125
2) White House coronavirus task force report on Arkansas. Describe two key points that struck you as interesting and worth a story, and explain why in a few paragraphs.
https://www.documentcloud.org/documents/7204533-Arkansas-9-6-20.html
3) Describe your weekly tasks, including status of pending projects, and put in links to your work
Agenda
Progress report on stories
IRE Conference Schedule - what are you attending? Take notes, grab tipsheets
Strategic Plan
Divide and Conquer List
Context and analysis
Nicole Clowney, guest speaker Wednesday
-Discuss
Context and Analysis
--Hutchinson statements: Gov. Asa Hutchinson @AsaHutchinson. There are 549 new COVID-19 cases in Arkansas. With over 8,800 PCR tests performed yesterday, we’re on track to meet and exceed our testing goal for this month. Continue to wear your mask to protect your friends and family.
--Static vs active: antigen numbers vs difference in tests.
Document obtained by the Center for Public Integrity.
Wong, Dona M. The Wall Street Journal Guide to Information Graphics. Ch. 2: Chart Smart. p 70->
-For Wednesday
1) Questions for Arkansas State Rep. Nicole Clowney. https://www.arkansashouse.org/district/86 Twitter: @NicoleClowneyAR
2) **Assignment #1: (Due Sept. 23, 11:59 pm): Managing Data / Static Graphic**
Agenda
Guest Speaker: Arkansas State Rep. Nicole Clowney, D-Fayetteville.
IRE Conference
Dplyr boot camp!!!
There is nothing else. Focus!!!
Assignment #1: (Due Sept. 23): Managing Data / Static Graphic
Guest Speaker: Rep. Nicole Clowney
Assignment #1: (Due Sept. 23): Managing Data / Static Graphic
Static Graphic - Managing Data in R / Tableau.
Students will use R Studio to gather, analyze and visualize Arkansascovid data and report and write a 600 word story.
DPLYR
DPLYR BOOT CAMP 5th Lesson 8-21-2020.Rmd
Notes: How Do I?
https://smach.github.io/R4JournalismBook/HowDoI.html
Machlis
Ch. 13 Date calculations
Dealing-with-dates.pdf by Andrew Ba Tran
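As a companion to the date readings, here is a minimal sketch of date handling with the lubridate package. The daily counts are made up for illustration. It turns text dates into real dates, then totals cases by week.

```r
library(dplyr)
library(lubridate)

# Hypothetical daily counts, made up for illustration.
# Dates arrive as text in month/day/year form.
daily <- tibble(
  date_text = c("9/14/2020", "9/15/2020", "9/16/2020", "9/21/2020", "9/22/2020"),
  new_cases = c(500, 620, 580, 700, 650)
)

weekly <- daily %>%
  mutate(
    date = mdy(date_text),                                  # text to Date
    week_start = floor_date(date, "week", week_start = 1)   # Monday of that week
  ) %>%
  group_by(week_start) %>%
  summarize(total_cases = sum(new_cases))

print(weekly)
```

If your dates come in year-month-day order, use ymd() instead of mdy().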
Agenda
Update on stories
Fact check process
R - ggplot with hospital data
Mary and mobile formatting for Tweets
IRE sessions discussion
Read the Feed
Discussion: Fact Check
https://docs.google.com/document/d/1ni0pLjAmr2XhEUgIV6XWhUiaQhsK7FXHDwa76hjFxlM/edit?usp=sharing
Exercise: Hospital Data GGLOT 9.28.2020.Rmd
-Wong, The Wall Street Journal Guide to Information Graphics, Ch. 3 and how that will apply to your Tableau work.
-Transforming and Analyzing Data dplyr.pdf, Andrew Ba Tran, Washington Post: https://github.com/profrobwells/CovidFall2020/blob/master/Reading/Transforming%20and%20Analyzing%20Data%20dplyr.pdf
-Reader feedback: Consider a response to this person. See “Stacy Robinson Your Post on Twitter” in Files | Teams https://bit.ly/32tw7Q1
Agenda
Partnership Updates
Spanish Workbooks -Rachel Sanchez-Smith
R - ggplot with hospital data
White House Report
Mary and mobile formatting for Tweets
IRE tipsheets - put your stuff on the Master Work document
Fact Check: Obed, Kendal
Read the Feed
Partner Updates
Biomedical Engineering Dept. Collaboration for Data Science
Arkansas Nonprofit News Network - Benjamin Hardy
University of Arkansas Humanities Center
White House Report
Latest White House Coronavirus task force report
Arkansas had 194 new cases per 100,000 population in the last week, compared to a national average of 93 per 100,000.
https://www.documentcloud.org/documents/7219580-Arkansas-9-27-20.html
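The arithmetic behind a per-100,000 rate is simple enough to sketch in R. The county figures below are hypothetical; only the 194 and 93 rates come from the report cited above.

```r
# Rate per 100,000 residents: cases divided by population, times 100,000
rate_per_100k <- function(cases, population) {
  cases / population * 100000
}

# Hypothetical county: 150 new cases in a week, 75,000 residents
rate_per_100k(150, 75000)   # 200

# Comparing the two rates cited in the task force report
arkansas_rate <- 194
national_rate <- 93
round(arkansas_rate / national_rate, 1)   # 2.1
```

In other words, the Arkansas rate was roughly twice the national average that week.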
Exercises
Exercise: Updated Hospital Data GGLOT 9.28.2020.Rmd
https://raw.githubusercontent.com/profrobwells/CovidFall2020/master/HospitalData_GGPLOT_9282020.Rmd
Post in Teams a Hospital admits / vents slide for Sept. 29 using the code from today
Multiple variable in a graph
Geom_Line, Geom_point, Geom_bar
How to alter the colors in a chart.
Tutorial: Graphing Introducton Jan 11 2020.R
A handy explanation of ggplot and its components
If you’re using ggplot: plus it!
For everything else: pipe it!
aes - reorder in ggplot
geom_point()
geom_bar()
geom_boxplot()
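To see "plus it" in action, here is a minimal ggplot2 sketch that puts two variables in one chart and changes the colors. The hospital numbers are made up for illustration and are not from the class script.

```r
library(ggplot2)

# Hypothetical hospital numbers in long format, made up for illustration
hospital <- data.frame(
  date = rep(as.Date(c("2020-09-25", "2020-09-26", "2020-09-27", "2020-09-28")), times = 2),
  measure = rep(c("Hospitalized", "On ventilators"), each = 4),
  patients = c(460, 470, 485, 490, 90, 95, 93, 100)
)

# ggplot layers are added with +, not piped
chart <- ggplot(hospital, aes(x = date, y = patients, color = measure)) +
  geom_line() +
  geom_point() +
  scale_color_manual(values = c("Hospitalized" = "steelblue",
                                "On ventilators" = "firebrick")) +
  labs(title = "Hypothetical hospital data",
       x = "Date", y = "Patients", color = NULL)

print(chart)
```

Mapping color to a column inside aes() is what splits the data into one line per measure; scale_color_manual() then sets the color for each.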
Reference for Today’s Lesson
Ch 5, Machlis
Export Static chart
Data Visualization
ggplot2 - charts and maps
GGplot Video from Andrew Ba Tran
https://www.youtube.com/watch?v=Sx7d7eGRSj0&t=9s
**Dplyr Presentation**
Five basic verbs filter() select() arrange() mutate() summarize() plus group_by()
Pipes - a Much-Used Command to Link Filters, Functions
pipe %>% CMD + Shift + M
Pipes are a way of chaining commands.
object %>% operation() —> result
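Here is a minimal sketch that chains the five verbs and group_by() with pipes. The data and column names are hypothetical, made up for illustration.

```r
library(dplyr)

# Hypothetical daily county data, made up for illustration
covid <- tibble(
  county = c("Washington", "Washington", "Benton", "Benton", "Pulaski", "Pulaski"),
  date = as.Date(c("2020-09-27", "2020-09-28", "2020-09-27", "2020-09-28",
                   "2020-09-27", "2020-09-28")),
  new_cases = c(40, 55, 30, 25, 80, 70),
  tests = c(400, 500, 350, 300, 900, 800)
)

summary_table <- covid %>%
  filter(county != "Benton") %>%                 # keep rows that match a condition
  select(county, new_cases, tests) %>%           # keep only these columns
  mutate(positivity = new_cases / tests) %>%     # add a calculated column
  group_by(county) %>%                           # one group per county
  summarize(total_cases = sum(new_cases),
            avg_positivity = mean(positivity)) %>%
  arrange(desc(total_cases))                     # largest total first

print(summary_table)
```

Read each %>% as "and then": take the data, and then filter, and then select, and so on.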
Presentation from Bob Rudis on Writing Readable Code with Pipes, delivered at the rstudio::conf 2017.
https://www.rstudio.com/resources/videos/writing-readable-code-with-pipes/
Key Concepts - Moving Forward:
Dplyr: Filters, Grouping, Sorting, pipes %>%
Pipe shortcut = CMD + SHIFT + M
Basic data visualization
Tidyverse
IRE tipsheets - put your stuff on the Master Work document
Machlis: Ch. 8 Analyze data by groups
Dplyr - Andrew Ba Tran - pipes-dplyr.pdf
Machlis: Ch. 9 Graphing by Group
Visual Narrative Tricks by Alberto Cairo
https://www.youtube.com/watch?v=TSGaueL4Ggk
Agenda
Weekly Tasks
About Page, Podcast Page
Fact Check
Website Usage
Flourish
Nicole Clowney on Wednesday
–Fact Check
Story —> Fact Checker
Heavner-PCR —> Zimmardi
Lamy-restaurants —> Heavner
Hennigan-Hispanics —> Seiter
Seiter-Nursing Home —> Hennigan
Zimmardi Deaths —> Lamy
–We Need to Turn This Around
Exercise:
How to Create a Flourish Chart
Create a Flourish account
https://app.flourish.studio/login
Import master_file.csv or your relevant data source (hospitals, schools)
Build Charts:
--Mary, Homepage numbers, then Demographics
--Katy, Statewide Case Data. Start with map with Active Cases by County
--Kendal, Statewide Death Data, start with Total Deaths
--Abby, Total cases by School District
--Obed, Statewide Hospitalizations, start with Hospitalizations Over Time
Overview
https://www.youtube.com/watch?v=fKO_jjqgooc&feature=emb_title
Ravi Brock and Flourish in 9 mins
https://video.uark.edu/media/Flourish+tutorial+with+Ravi+Brock/1_g6nzw3h0
Cairo and Flourish - Overview
https://www.youtube.com/watch?v=cN1Q9MusZbc
Cairo - Flourish - Stories
https://www.youtube.com/watch?v=P7AmUdSBOVU
Agenda
Nicole Clowney
Flourish
Fact Check
-Happy Fall!
Abby:
https://public.flourish.studio/visualisation/3936735/
https://public.flourish.studio/visualisation/3941284/
Kendal:
-Finish Your Flourish Workbooks by Friday
Flourish - Stories
https://www.youtube.com/watch?v=P7AmUdSBOVU
Intro
https://help.flourish.studio/article/9-creating-a-visualization
Bar chart race template
https://towardsdatascience.com/step-by-step-tutorial-create-a-bar-chart-race-animation-da7d5fcd7079
https://help.flourish.studio/article/63-how-to-change-label-positioning
Stories in Flourish
https://www.youtube.com/watch?v=9HTZUXNOLVQ&feature=emb_rel_end
Importing Data
https://www.youtube.com/watch?v=Rscfi7QZVvs&feature=emb_rel_end
https://help.flourish.studio/article/12-adding-data-to-a-template
Misc
Google sheets https://help.flourish.studio/article/165-how-to-pull-through-data-from-a-google-sheet
Story basics
https://www.youtube.com/watch?v=9HTZUXNOLVQ&feature=emb_rel_end
Content to popups
https://help.flourish.studio/article/69-how-to-add-custom-content-to-your-popups
Basic Help page
Overview
https://flourish.studio/2019/12/19/2019-year-in-review/
Spanish https://flourish.studio/2019/10/23/informar-elecciones-con-flourish/
Different accounts https://help.flourish.studio/article/16-controling-access-to-visualizations-and-stories
Sorting data https://help.flourish.studio/article/36-how-to-display-your-data-in-a-different-order
Agenda
Flourish
White House Task Force Report
More Flourish
Story Updates
Task List
Exercises
Katy - Maps
Abby
Connect to live data
Templates Drive The Bus
Build Charts:
--Mary, Homepage numbers, then Demographics
--Katy, Statewide Case Data. Start with map with Active Cases by County
--Kendal, Statewide Death Data, start with Total Deaths
--Abby, Total cases by School District
--Obed, Statewide Hospitalizations, start with Hospitalizations Over Time
Alberto Cairo, Flourish Tutorials, Parts 1-6
Flourish Workbook Updates
Discuss Story Pitch #2 - Use This Form:
https://docs.google.com/forms/d/e/1FAIpQLScws1-wOhgQ7DV4MhGNoX8QbJFdX6LH91oY6NBv76GtRzTBTA/viewform
Agenda
Flourish Workbook Updates
Daily Task List
Discuss Story Pitch #2
Translated story
Demographic Data Adventure
Slicing and Dicing Exercise
Exercises
Using R to Shorten, Simplify the Master File
Resources
**Dplyr Presentation**
Five basic verbs filter() select() arrange() mutate() summarize() plus group_by()
Flourish Workbook Updates
Machlis: Ch. 8 Analyze data by groups
Transforming and Analyzing Data dplyr.pdf, Andrew Ba Tran, Washington Post:
http://learn.r-journalism.com/en/wrangling/dplyr/dplyr/
250 Words with Sources: Story Pitch #2
Students will gather, analyze and visualize Arkansascovid data, primarily with R Studio. Students will produce publication-ready graphics in Flourish from data. Stories will be footnoted and ready for fact checking. R script due with story. Write the story in a Google Doc, not Word. Make sure sharing is enabled.
Stories Due Saturday, Oct 24
Schedule
Demographic data
Flourish
R exercises
R Exercises Due Wednesday
https://docs.google.com/spreadsheets/d/1gvm3BuS83NRbsx0pS_PuYTFytN4sPS3B9Xhq-6gnu6c/edit#gid=0
Abby training on Data Update
–Exercises
Using R to Shorten, Simplify the Master File
You are in Dplyr bootcamp!
DPLYR BOOT CAMP 5th Lesson 8-21-2020.Rmd Make a Dplyr Cheat Sheet
Joining Dataframes in R
https://www.youtube.com/watch?v=gLg4D9bMIyc&t=13s
Ch. 7 Two or more data sets
Data Wrangling
http://learn.r-journalism.com/en/wrangling/
Joins in R:
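Here is a minimal sketch of three common dplyr joins. Both tables and all of the numbers are hypothetical, made up for illustration. The only requirement is a shared column, here "county".

```r
library(dplyr)

# Hypothetical tables that share a "county" column; values are made up
cases <- tibble(
  county = c("Washington", "Benton", "Pulaski"),
  total_cases = c(9500, 7200, 12000)
)
population <- tibble(
  county = c("Washington", "Benton", "Sebastian"),
  population = c(240000, 280000, 128000)
)

# left_join keeps every row of the first table; unmatched rows get NA
left <- cases %>% left_join(population, by = "county")

# inner_join keeps only counties found in both tables
both <- cases %>%
  inner_join(population, by = "county") %>%
  mutate(cases_per_10k = total_cases / population * 10000)

# anti_join lists rows in the first table with no match,
# which is useful for spotting mismatched county names
unmatched <- cases %>% anti_join(population, by = "county")

print(left)
print(both)
print(unmatched)
```

Running the anti_join first is a cheap way to catch spelling differences before they quietly drop rows from your analysis.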
Agenda
White House Report
Stories Due Saturday, Oct 24
R Tables
Flourish
–Exercises
Using R for Math
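One calculation that comes up constantly in this kind of reporting is percent change. Here is a minimal sketch in base R. The weekly totals are made up, and this is an illustration of the formula rather than the exercise file itself.

```r
# Percent change: (new - old) / old * 100
pct_change <- function(old, new) {
  (new - old) / old * 100
}

# Hypothetical weekly case totals, made up for illustration
weekly_cases <- c(5000, 5500, 6050, 5445)

# Compare each week with the one before it
change <- pct_change(head(weekly_cases, -1), tail(weekly_cases, -1))
round(change, 1)   # 10 10 -10
```

head(x, -1) drops the last value and tail(x, -1) drops the first, which lines each week up against the previous one.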
Agenda
Stories - Wednesday update on reporting.
Flourish Update
School Data Calculations
Weekly Data Calculations
R Tables Exercises
Tweet Themes for This week
Thanksgiving Week: Volunteer for Black Friday
Big Picture
We have achieved significant goals so far.
--Automated data gathering
--Converted home page to mobile
--Built new sophisticated analytical tools
--Published five stories, two in Spanish
--We are getting four stories from Schulte's class
Where we are going.
--Heading into a more stable phase
--More strategic work on daily tweets
--More on weekly and monthly trends using R
--More on schools, nursing homes, Hispanic community
–Stories from Schulte’s Class
In final edit and ready for fact check
Hennigan on a family's struggle:
Alexus Underwood on nursing homes:
https://alexusfeatures.wordpress.com/2020/10/12/nursing-home-restrictions-during-covid-19/
Requiring additional work but I will figure a way to publish these:
Andrew Watson and Marshallese
Breanna De Leeuw and Marshallese
https://ontheddotl.wordpress.com/2020/10/19/marshallese-community-pushes-on/
–Tweet Themes
What are the daily stories we will see this week and how can we prepare for them?
Last week, COVID-19 sidelined the governor of Arkansas from public activities.
--Hospitals
--Teacher deaths
--School outbreaks
--Weekly trends
–Workshop
How to streamline the Flourish updates
Katy and Flourish update process
Abby and Flourish: compress charts
Wells/Zimmardi: How we calculated school COVID data
Wells: How we calculated weekly trend data
–Exercises
Agenda
A gentle introduction to APIs for data journalists:
Max Harlow Presentation on How to Use GitHub
Sign up for a census key:
https://api.census.gov/data/key_signup.html
What is an API?
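In practice, an API request is often just a URL with parameters. Here is a minimal sketch that builds a Census API request in base R without sending it. The dataset (2019 ACS 5-year), the variable (B19013_001E, median household income) and the placeholder key are assumptions chosen for illustration, not part of the assigned readings.

```r
# Build a Census API request URL piece by piece.
# The key below is a placeholder; substitute the one emailed to you.
census_key <- "YOUR_KEY_HERE"

base_url <- "https://api.census.gov/data/2019/acs/acs5"   # assumed dataset: 2019 ACS 5-year
variables <- "NAME,B19013_001E"                           # county name, median household income
geography <- "for=county:*&in=state:05"                   # all counties in Arkansas (FIPS 05)

request_url <- paste0(base_url, "?get=", variables, "&", geography, "&key=", census_key)
print(request_url)

# With a real key, the jsonlite package can read the response:
# income <- jsonlite::fromJSON(request_url)
# head(income)
```

Pasting the finished URL into a browser is a quick way to check the request before writing more code around it.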
Agenda
Story Updates
Katy's FOIA Gold Mine
Flourish Update
Math with R
Exercises
Using Lubridate
Katy: Tune up daily update slides in R script. Then FOIA
Today's Exercise - A Big Update to this file.
Download:
Read “Connecting with the Dots” by Jacob Harris (2015) and discuss how people should or should not be represented through news visualizations.
A gentle introduction to APIs for data journalists:
Max Harlow Presentation on How to Use GitHub
Sign up for a census key:
https://api.census.gov/data/key_signup.html
What is an API?
Agenda
Story Updates
La Prensa Libre
R Exercise
Top_n
List of Days of Most New Cases
https://app.flourish.studio/visualisation/4007988/edit
Data is topcases.csv
Values: Date, New Cases Today (column D)
--R table automatically selects the top 10.
--no filter
Exercises
Using Lubridate
Katy: Tune up daily update slides in R script. Then FOIA
Today's Exercise - A Big Update to this file.
Download:
Machlis Ch. 12 Putting it all Together: R on Election Day
Agenda
Storytelling with data
Story updates
Continue R exercises
Schools and demographics
Exercises
Schools and demographics
https://raw.githubusercontent.com/profrobwells/CovidFall2020/master/School%20Data-abby.Rmd
Using Lubridate
Katy: Tune up daily update slides in R script.
Today's Exercise - Compiling Data by Dates. Due Saturday with Weekly Memo.
Download:
Tidy text mining
https://www.tidytextmining.com/tidytext.html#
Bad data visualizations. Data Translation.
The Journalist as Programmer: A Case Study of The New York Times Interactive News Technology Department
http://isoj.org/wp-content/uploads/2016/10/ISOJ_Journal_V2_N1_2012_Spring.pdf
What is code?
http://www.bloomberg.com/graphics/2015-paul-ford-what-is-code/
Agenda
Grant update
Story updates
Biomedical Engineering and Spring class
Calculations in R
Red vs Blue Covid election results map
Exercises
Slicing and Dicing:
Red vs Blue Covid election results map
https://apnews.com/article/counties-worst-virus-surges-voted-trump-d671a483534024b5486715da6edb6ebf
Julia Angwin, Terry Parris Jr., Surya Mattu. “Breaking the Black Box: What Facebook Knows About You,” ProPublica, 2016;
Nicholas Diakopoulos, “Algorithmic Accountability,” Digital Journalism, 2014.
Agenda
Story Updates
Graphics - ggplot
Red vs Blue Covid election results map
Quick Charts in Daily Script
How We Did This: Daily Update Script
https://raw.githubusercontent.com/profrobwells/CovidFall2020/master/Daily%20Update.Rmd
We will build this today
GGPLOT Cookbook
New Fun with Slicing and Dicing
https://github.com/profrobwells/CovidFall2020/blob/master/Exercises/Slicing-Dicing%20Data%2010-14-2020.Rmd
--Follow this tutorial
https://guides.github.com/activities/hello-world/
Data Visualization Basics
Load tutorial: Basic Data Visualization 12-26-18.R
Agenda
Wells limited availability this week
More on data update- Tweet the slides
Tweet Person: Look for Something Different (Analysis)
Abby data update
WordPress
Graphics
Exercises
We will build this today
GGPLOT Cookbook
New Fun with Slicing and Dicing
https://github.com/profrobwells/CovidFall2020/blob/master/Exercises/Slicing-Dicing%20Data%2010-14-2020.Rmd
Agenda
Twitter Boss: Mary & Katy
HootSuite
Flourish Animation
HootSuite
HootSuite Overview
https://help.hootsuite.com/hc/en-us/articles/115010088387-Create-and-publish-posts
Instagram issue
Exercise
Katy’s Chart
https://app.flourish.studio/visualisation/4380046/edit
Tutorial
Details on Line Chart Race
https://help.flourish.studio/article/75-line-chart-race-an-overview
Details of a Bar Chart Race
https://help.flourish.studio/article/44-bar-chart-race-an-overview
Attempt at a Bar Chart Race by County on New Deaths
--Formatting Data to Match Flourish Requirements.
--Pivot Table
https://app.flourish.studio/visualisation/4379103/edit
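A minimal sketch of the reshaping step, assuming the bar chart race template wants one row per county and one column per date (check the Flourish help page above for the exact layout). The county names, column names and counts are made up.

```r
library(tidyr)

# Made-up long-format data: one row per county per day
long <- data.frame(
  county = c("County A", "County A", "County B", "County B"),
  date = c("2020-11-01", "2020-11-02", "2020-11-01", "2020-11-02"),
  new_deaths = c(1, 3, 0, 2)
)

# Pivot to wide format: one row per county, one column per date
wide <- pivot_wider(long, names_from = date, values_from = new_deaths)

print(wide)
```

This does in R what a pivot table does in Excel: dates move from rows into column headers.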
Your Turn:
Build Cases Per 10,000
Build Demographics Over Time
Build Counties Over Time
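A minimal sketch of the cases per 10,000 calculation. The counties, case counts and populations are made up for illustration; use the actual county data in class.

```r
# Made-up numbers for illustration; these are not actual county counts or populations
counties <- data.frame(
  county = c("County A", "County B"),
  cases = c(500, 120),
  population = c(200000, 15000)
)

# Rate per 10,000 residents = cases / population * 10,000
counties$cases_per_10k <- round(counties$cases / counties$population * 10000, 1)

print(counties)
```

County B has fewer cases but the higher rate (80 vs. 25 per 10,000), which is why we compare rates rather than raw counts across counties.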
--Follow this tutorial
https://guides.github.com/activities/hello-world/
--APIs - basics
--Maps: Adapt Machlis - Maps in R Ch 11 exercise for Median Income in Arkansas
Agenda
Class Schedule for Rest of Semester, Final Week
Story #2 Due Tuesday
Assignment #3 Due Sunday
Datawrapper
November Wrapup
Daily Work Schedule - Mary and Katy
Option 1) Election - COVID Map in Datawrapper.
Students will use R Studio to compile presidential election returns by county and compare them to each county’s COVID-19 rate. The map will be visualized in Datawrapper. A short 250-word story will describe your findings. Data dictionary required. Post Google Doc with story and link to Datawrapper map on Blackboard, 11:59 p.m. Sunday
OR
Option 2) Article
Write a 600-word article and graphic on a COVID-19 topic of your choice. Pitch is due Tuesday, 5 p.m. on Teams
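For Option 1, a minimal sketch of combining election returns with COVID-19 rates by county. All names and numbers are made up, and the column names are hypothetical; the real files will need their own cleaning before the join.

```r
library(dplyr)

# Made-up election returns and COVID-19 rates, one row per county
election <- data.frame(
  county = c("County A", "County B"),
  votes_candidate_1 = c(6000, 2000),
  votes_total = c(10000, 8000)
)
covid <- data.frame(
  county = c("County A", "County B"),
  cases_per_10k = c(350, 410)
)

# Compute the vote share, then attach each county's COVID-19 rate
county_map_data <- election %>%
  mutate(pct_candidate_1 = round(votes_candidate_1 / votes_total * 100, 1)) %>%
  left_join(covid, by = "county")

print(county_map_data)
```

The joined table can be saved with write.csv() and uploaded to Datawrapper. County names must be spelled the same way in both files for the join to match.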
Datawrapper
Exercises
Create Monthly Totals for November
Create a New R Markdown File for November Totals
Use this Slicing and Dicing Exercise as a Template
Monthly Totals for Hospitals, Deaths, New Cases
Visualize in Flourish
Post your code on Teams
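A minimal sketch of the November totals, assuming a daily table with one row per date. The column names new_hospitalizations, deaths and new_cases are hypothetical and the numbers are made up.

```r
library(dplyr)
library(lubridate)

# Made-up daily records spanning the end of October to the start of December
daily <- data.frame(
  date = as.Date(c("2020-10-31", "2020-11-01", "2020-11-02", "2020-12-01")),
  new_hospitalizations = c(10, 12, 15, 20),
  deaths = c(1, 2, 3, 4),
  new_cases = c(100, 200, 250, 300)
)

# Keep only November 2020 rows, then total each column
november <- daily %>%
  filter(year(date) == 2020, month(date) == 11) %>%
  summarise(across(c(new_hospitalizations, deaths, new_cases), sum))

print(november)
```

Only daily new counts should be summed this way. A running total or a count of patients currently hospitalized would be double counted.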
Machlis: Ch. 14 Integrate R With Your Storytelling Using R Markdown
Agenda
Election Map Workshop
R Markdown
Maps
Github
Work Schedule Update
Work Schedule
Going forward, the person doing the data update will handle only the data. I’ll deal with the tweets or designate someone that day to handle them.
Option 1) Election - COVID Map in Datawrapper.
Students will use R Studio to compile presidential election returns by county and compare them to each county’s COVID-19 rate. The map will be visualized in Datawrapper. A short 250-word story will describe your findings. Data dictionary required. Post Google Doc with story and link to Datawrapper map on Blackboard, 11:59 p.m. Sunday
OR
Option 2) Article
Write a 600-word article and graphic on a COVID-19 topic of your choice. Pitch is due Tuesday, 5 p.m. on Teams
Workshop: Election Map
Datawrapper
Agenda
Story Updates
GitHub
Course Evaluation
Mary and Katy: Discuss Methodist Family Health Zoom Call, 8:45 a.m. Wed.
Important!
Course Evaluation
Please do me a favor and evaluate this course.
It's important to me and the department to get your thoughts on what worked and what did not.
If you think it is important, then please take five minutes to fill out the survey.
Example: Updating Class GitHub https://github.com/profrobwells/CovidFall2020
Arkansascovid.com GitHub https://github.com/Arkansascovid/Main
Exercises: Basic GitHub
This class is intended to teach you modern workflow techniques for coding. A centerpiece of that workflow is GitHub, a website built on a version control system that lets you collaborate with other programmers on coding projects. It manages versions of software code and is very popular with the tech elite.
Your GitHub account, which is public, represents an important professional image. Prospective employers and collaborators will look at your GitHub account.
Create a GitHub account
Exercise #1: Hello World Setup
https://guides.github.com/activities/hello-world/
Exercise #2: GitHub flow
https://guides.github.com/introduction/flow/
Help Files
https://help.github.com/en/desktop
Exercise #3: Pull Request
https://docs.google.com/presentation/d/1MbltRcOerktc-E26HMDjYj0BO9CTubQWu1Z2bB9CpVY/edit#slide=id.g448ccc227721fe56_10
See the above link, Max Harlow on How to Use GitHub
1. Create a test repository, call it "Junk"
2. Commit copies of a random R script and a random text .txt file
3. Pair up with a buddy. Follow them on GitHub
4. Fork their Junk repository
5. Report an Issue on their repository. See slides #47-53
6. Create a pull request on their repository. See slides #54-68
7. Resolve the issues and pull request for your own GitHub account.
8. Revel in your nerd powers. Watch Star Trek reruns. Eat Pringles.
GitHub Resources
Basic GitHub 4-22-19.R
https://bit.ly/2UAMGTd
Another GitHub guide
https://andrewbtran.github.io/NICAR/2018/workflow/docs/03-integrating_github.html
Simplified GitHub- GitHub Desktop
https://help.github.com/en/desktop
Setting up an R Workflow
http://learn.r-journalism.com/en/publishing/workflow/r-projects/
Advanced: Set Up GitHub in an R Project https://support.rstudio.com/hc/en-us/articles/200532077?version=1.2.5033&mode=desktop
Agenda
Twitter API
R Markdown
Wrap up
Datawrapper
Datawrapper & Flourish advanced training
From Adam Marton, Capital News Service and the Howard Center:
I love both those programs. They really allow you to do nice charts, graphs and maps quickly. For me, that means more time for complex data viz and less time in Illustrator or D3.
Datawrapper has more baked in map shapes. I tend to generally prefer Flourish these days because they have more customization options and visualization types, but they are both great. It took me a semester or two to get my head around which to use in which situation. Some resources below:
Here is some Datawrapper training I did for CNS. Datawrapper (the company) has used these videos as well! https://www.youtube.com/watch?v=bmGgzBKcK_M&t=3132s
Here is a link to a doc I keep of custom Flourish visualization templates that I like. https://docs.google.com/document/d/1DSly1_5sk93bc3xhyBu0zGWtmKaX9N_SYeyXesBj5ho/edit
I am going to forward you a message from a journalist at the Philadelphia Inquirer that I was talking to. He does some cool maps in Datawrapper and has a great workflow for importing your own map shapes. It is worth holding onto for the future.
Let me know how I can help!
-Adam
Adam:
Datawrapper has good instructions on how to upload your own map here: https://academy.datawrapper.de/article/145-how-to-upload-your-own-map
The general process, or at least the workflow I use, is:
From here, the process to finish your map is identical to creating any other Datawrapper map.
To go even further with a custom map, you can upload a TopoJSON that has a polygon layer for places like precincts and municipalities, and an outer boundary line layer for things like county or state boundaries. For instance, this election map has 240 municipalities in the much-discussed Philadelphia suburbs, as well as 60-plus wards for the city of Philadelphia. It also needed to have thicker lines to show county boundaries as a reference point for readers, or else all the polygons would be one large blob. Here’s an example using another election map: https://www.inquirer.com/politics/election/philadelphia-presidential-votes-suburban-county-ward-map-20201109.html
You start by uploading a multi-layer TopoJSON. To do that, import a polygon shapefile and a line shapefile into mapshaper.org. Make sure you apply the correct projection and set the interior points for both layers. After you export as a TopoJSON and bring it into Datawrapper, hit the toggle switch for Additional options (advanced). “Regions” are your polygons and choose “Outer borders” to set the thick boundary lines.
Anyway, that’s a new Datawrapper trick that I just picked up for our election maps.
John
R Markdown:
Video:
https://video.uark.edu/media/R+Markdown1/0_k08o7izd
Twitter analysis of Trump Tweets
http://varianceexplained.org/r/trump-tweets/
Twitter historical API
https://developer.twitter.com/en/docs/tutorials/choosing-historical-api
Congratulations!
We covered a lot this semester
Twitter API
Twitter Scraping
Amy Webb future of journalism trends
Google search tips
https://blog.expertisefinder.com/top-6-google-search-tips-for-journalists/
Artificial intelligence in the news
Sharon Machlis's NICAR compilation site
http://www.machlis.com/nicar19.html
Seth C. Lewis, et al. “Big Data and Journalism: Epistemology, Expertise, Economics and Ethics,” Digital Journalism, 2015
Exercises: GGPLOT Cookbook
Data Visualization Basics
Load tutorial: Basic Data Visualization 12-26-18.R
Adapt Machlis - Maps in R Ch 11 exercise for Median Income in Arkansas
Maps in R March 24 2019 - Using the Census API
https://bit.ly/2FvZDrB
Building a Census tract map
Map Demo: Interactive Map.R
Adapt Machlis - Maps in R Ch 11 exercise for Median Income in Arkansas
You will make this in the class: Images/ARmed_income3-22-19.png
Assignment Proposal: Interactive Map.
Students will use R Studio to build interactive maps of Arkansascovid occupational data in Arkansas. Results will be posted on GitHub. Data dictionary required
tran mapping-census-data.html
TK TK TK Test Canvas: Excel Quiz
Quiz - Math and R Splitting Hashtags 2-25-19.R
https://bit.ly/2BQIE2i
Extracting Text Strings from data
Splitting Hashtags 2-25-19.R
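A minimal sketch of extracting hashtags from text in base R. The tweets are made up, and the class script may use a different approach (for example, stringr).

```r
# Made-up tweets for illustration
tweets <- c(
  "Wear a mask #COVID19 #Arkansas",
  "No hashtags here",
  "Vote today #Election2020"
)

# gregexpr() finds every match of "#" followed by letters, digits or underscores;
# regmatches() pulls those matches out, returning one character vector per tweet
hashtags <- regmatches(tweets, gregexpr("#[A-Za-z0-9_]+", tweets))

# Flatten the list and count how often each hashtag appears
hashtag_counts <- table(unlist(hashtags))

print(hashtags)
print(hashtag_counts)
```

A tweet with no hashtags returns an empty vector, so it drops out of the count without causing an error.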
R Markdown to distribute findings to Stanford, Feb 2020
https://profrobwells.github.io/HomelessSP2020/SF_311_Calls_UofA.html
Machlis Chs. 17 & 18
Census Reporter to look up tables
Simple Web Scraping
Continue With Text Mining
Turn your R cheatsheet into a PDF
Turn your R cheatsheet into a web page on GitHub
Coulter Bigrams - Score
Images/Coulter bigram score.jpeg
Kavanaugh text mining story: Text analysis of Brett Kavanaugh’s opinion.
http://www.storybench.org/bringing-textual-analysis-tools-to-judge-brett-kavanaughs-latest-opinion/
Text Mining with R
https://www.tidytextmining.com/
Ch. 1 The Tidy Text Format
Ch. 2 Sentiment Analysis
AOC-Coulter Data Mining Exercise:
Coulter Tweet Analysis #2 Exercise
https://bit.ly/2TdY9Mv
Key to the exercise:
https://bit.ly/2F3brlh
Text Mining with R
https://www.tidytextmining.com/
Ch. 3 Analyzing word and document frequency: tf-idf
Web Scraping in R: Simple Web Scraper
Machlis Ch. 15 Simple Web scraping
We will build this
Images/Income by Census Tract Washington Co.png
Video of Machlis mapping
https://www.youtube.com/watch?v=HFJOV5XaU_U
See R script: Maps in R March 24 2019
https://bit.ly/2FvZDrB
Maps
Andrew Ba Tran - Mapping
https://nicar.r-journalism.com/docs/
http://learn.r-journalism.com/en/mapping/
Data Cleaning
Disaggregating variables for summation
Mapping
APIs
Census API Exercise
tran mapping-census-data.rmd
Mapping Demo: Map Demo 6th Lesson 8-22-2020.Rmd
Mapping Exercise from Machlis book, Ch. 11
https://bit.ly/2VXCSU2
Machlis Ch. 11 Maps in R
Using R for Math
--Extra Discussion: Lubridate
Review Ch. 13 Machlis
Review Tran and Lubridate
Assignment1_KEY_StaticGraphic_2_9.R
Dplyr bootcamp.
-Date Corrected Data https://video.uark.edu/media/Date+Corrected+Data/1_2rmzgzwu
StackOverflow
https://stackoverflow.com/questions/46691933/r-sort-by-year-then-month-in-ggplot2
--Review another R tutorial
https://docs.google.com/presentation/d/1zICxR7qDM3RQ2Nxi5CqHlM3H8I7qoVkNtqcNcnbbDCw/edit#slide=id.p
RStudio Navigation Tricks You Might’ve Missed
https://rviews.rstudio.com/2016/11/11/easy-tricks-you-mightve-missed/
How Do I?
https://smach.github.io/R4JournalismBook/HowDoI.html
Functions
https://smach.github.io/R4JournalismBook/functions.html
Packages
https://smach.github.io/R4JournalismBook/packages.html
All Cheat Sheets
https://www.rstudio.com/resources/cheatsheets/
String data manipulation
https://dereksonderegger.github.io/570L/13-string-manipulation.html
Follow StoryBench, Northeastern Univ.
https://twitter.com/storybench
Use R instead of Excel
https://trendct.org/2015/06/12/r-for-beginners-how-to-transition-from-excel-to-r/
Basic data work- head to http://bit.ly/excel_and_r
dplyr
https://github.com/r-journalism/learn-chapter-3/blob/master/dplyr/pipes-dplyr.R
Coulter Tweet Analysis #2 Exercise
https://bit.ly/2TdY9Mv
Twitter exploration exercise
https://bit.ly/2Sqn1j1
Return to Twitter Engagement
http://bit.ly/2GParD5
AOC Twitter feed
https://bit.ly/2Sqn1j1
Discuss Twitter Metadata
https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/user-object
Study Twitter metadata
https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object.html
Look at this example: Ocasio.csv
Twitter analysis of Trump Tweets
http://varianceexplained.org/r/trump-tweets/
Show Collins results
Bots
Bot or Not: Difficulty determining a bot on Twitter
An app that uses machine learning to guess if a Twitter account is a bot
https://www.r-bloggers.com/botrnot-an-r-app-to-detect-twitter-bots/
https://mikewk.shinyapps.io/botornot/
Article about Botometer
https://www.vox.com/technology/2018/4/9/17214720/pew-study-bots-generate-two-thirds-of-twitter-links
Stanford research paper on this topic
https://pdfs.semanticscholar.org/e219/6b47133c2191d380098744c13ba77133e625.pdf
–30–