Class 25: New Frontiers
You have all learned new skills in this class: wrangling data with spreadsheets, designing and mocking up graphics in Figma, and putting experiences together in raw web code. Hopefully you now see not only how you can put together data stories from scratch, but also how much craft goes into all the visualizations and interactives you see every day.
In this class, we will mostly spend time working on final projects, but we will briefly cover how you can take your skills to the next level across these domains.
Data collection
We have covered spreadsheets: formulas, sorting, pivot tables. But there is so much more that can be done with data.
Databases
SQL (pronounced "sequel", like “equal” but starting with an ‘s’) is a language designed for querying and filtering data. It can be used to perform advanced searches within a dataset. You can also use SQL to combine multiple datasets using what’s called a JOIN, which matches up rows that share a common value, such as a zip code. This powerful technique allows you to connect insights across multiple sources of data, e.g. you could compare the number of homeless shelters in each zip code against the median income of the surrounding neighborhood.
Useful links to learn more:
- Introduction to SQL for Data Journalism (an excellent guide from my former teacher at Stanford, Dan Nguyen)
- Training materials for Data Journalists (scroll down to the SQL section)
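To make the JOIN idea concrete, here is a minimal sketch that runs SQL from Python using the built-in sqlite3 module. The table names, column names, and all of the numbers are made up for illustration; a real project would load actual shelter and income datasets instead.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Two hypothetical datasets: one row per shelter, one row per zip code.
conn.executescript("""
    CREATE TABLE shelters (name TEXT, zip TEXT);
    CREATE TABLE incomes (zip TEXT, median_income INTEGER);

    INSERT INTO shelters VALUES
        ('Shelter A', '10001'),
        ('Shelter B', '10001'),
        ('Shelter C', '10002');
    INSERT INTO incomes VALUES
        ('10001', 52000),
        ('10002', 87000),
        ('10003', 120000);
""")

# LEFT JOIN keeps every zip code, even one with no shelters at all.
# COUNT(shelters.name) only counts rows where a shelter was matched.
query = """
    SELECT incomes.zip, incomes.median_income,
           COUNT(shelters.name) AS shelter_count
    FROM incomes
    LEFT JOIN shelters ON shelters.zip = incomes.zip
    GROUP BY incomes.zip, incomes.median_income
    ORDER BY incomes.median_income
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)

conn.close()
```

Each printed row is a zip code, its median income, and its shelter count. A LEFT JOIN is used here rather than a plain JOIN so that zip codes with zero shelters still show up in the result, which is often the interesting part of the story.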
Data clean-up
As I’m sure you all experienced, datasets can be messy. Something as innocuous as a space character before a field can throw a wrench in your pivot table. Address columns might use different formats. Some rows can have really obvious errors, such as all the columns being offset by one. Being able to clean up a dataset is a valuable skill, and fortunately people have designed tools to help.
Dataset clean-up tools:
- OpenRefine: the de facto tool for data clean-up. It shines when merging addresses together with different formats or other fields with minor typos. Look up online tutorials for the tool.
- Tabula: Have data trapped in a PDF? Tabula is an ambitious tool that works pretty well at extracting tables that are sealed in PDF documents. It has some flaws but generally outperforms other solutions.
- DataWrangler: a tool for cleaning and transforming data which has now been merged into a commercial product called Trifacta. The free version should still be available if you consult the linked website.
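If you end up cleaning data with code instead of one of these tools, the stray-space problem described above can be handled in a few lines. This is only a rough sketch using Python's built-in csv module; the sample data and the clean_row helper are invented for illustration, and title-casing every field is a simplification that would mangle some real values (e.g. "McDonald").

```python
import csv
import io

# Made-up messy data: stray spaces and inconsistent capitalization.
raw = """name,city
 Alice ,new york
Bob,  New York
carol ,NEW YORK
"""

def clean_row(row):
    """Trim whitespace from every field and normalize capitalization."""
    return {key.strip(): value.strip().title() for key, value in row.items()}

# io.StringIO lets us read the string as if it were a CSV file.
cleaned = [clean_row(row) for row in csv.DictReader(io.StringIO(raw))]

# After cleaning, the three spellings collapse into a single city.
cities = {row["city"] for row in cleaned}
print(cleaned)
print(cities)
```

Before cleaning, a pivot table would treat the three city spellings as three different cities. After cleaning, they count as one.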
Design
There is so much you can do with design, and we have just scratched the surface with our explorations in Figma. Design is an art and a science, and the levels of mastery that can be achieved are limitless. In class we have viewed the craft of design as an iterative process, starting from sketches, moving towards mocks and prototypes, and ending with finished products, be they visualizations or websites. This is not the only approach – in fact, I went with it simply because it matches with my own process. But different techniques work for everyone.
Sketching
The art of scratching some quick designs together on paper never gets old. It’s how many of the most accomplished designers start their process, and there is a lot more you can do beyond pencil and paper.
Ways sketching can be enhanced:
- Use other drawing tools, from colored pencils and highlighters (to give your sketches a splash of color) to sticky notes to full-on paintbrushes. Everyone has a different process. Do what works for you.
- Check out various drawing apps. If you have an iPad or other tablet there may be some useful apps. A particularly famous (and full-featured) drawing app is Procreate for the iPad. You can also invest in a drawing tablet that plugs into your laptop/computer.
Mocking and Prototyping
Once you have a solid idea of how things will look, it’s time to put that in place by creating design mock-ups or prototypes. This is like a sketch but in your favorite graphic design software, e.g. Figma. Many designers skip the sketch part entirely and just start here.
Useful tools for prototyping:
- Balsamiq: Software for quickly wireframing concepts. It intentionally restricts your design to a comic-y, sketched style, which keeps you from worrying too much about visual details early on.
- Other wireframing tools: I’ve only used Balsamiq so I can’t vouch for other solutions, but this list might point you to some useful tools.
Graphic design
Once you have a good idea of what you want to create – be it with sketches, mocks, or just a well thought-out vision – it is time to actually make it. In class we looked at Figma, which is what I primarily use because it is free but still professional, and it offers online collaborative features. There are lots of other tools you can use, too, each with their advantages.
Design tools:
- Adobe Photoshop: The industry-leading photo editing software
- Adobe Illustrator: Nice software for making vector graphic illustrations
- Adobe XD: A prototyping design tool similar to Figma
- Sketch: A macOS-only app that is widely used by designers. Another alternative to Figma
Coding
Even though we did a surface-level dive into coding, we still covered a lot of ground. We looked at three different languages – HTML, CSS, and JavaScript – which power the modern web. Though programming can seem finicky and precise, it offers near-unlimited potential. And once you get good at just one or two programming languages, it becomes really easy to pick new ones up (you will find that different programming languages are more like different dialects of the same common tongue).
Code editors
In class, we used WebWarrior, a tool I custom-built because I wanted to provide everyone with a common tool that worked online and did not require you to have your own laptop. But most programmers don’t use WebWarrior. They use something called an IDE, or integrated development environment. An IDE is essentially a code editor, an application you install on your computer that lets you write code and then run it. In that sense, WebWarrior is just an online IDE.
Moving forward with an IDE
- Visual Studio Code: This is the IDE I use. I think it has the nicest interface, and it has a lot of plugins you can install to customize it.
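- Atom: This is a minimalist editor a lot of people have used. Note that GitHub ended development of Atom in December 2022, so it no longer receives updates.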
- Sublime Text: This is a glorified text editor, but it’s really simple and clean. People like it.
- Brackets: another minimalist editor that specializes in web programming
- Anything else, really. Coding is a matter of personal preference. As you experiment with tools and grow as a coder, use what works best for you, not what some blog or instructor tells you is best.
Programming languages
Web programming is pretty cool. In fact, it’s my favorite kind of programming. But there is much more in the coding world: you can use code to process data, write desktop applications, or scrape websites. We will peek at a few other programming languages and their uses.
Languages:
- Python: This is probably the most popular programming language in the world right now for non-web applications. Python is an easy-to-use and flexible scripting language that can do everything from scientific data processing to downloading and scraping websites.
- R: A programming language geared towards data scientists that is popular among journalists as well. R is able to do high-level statistical analysis as well as produce high-quality graphs.
- C: This is one of the most low-level languages out there, meaning you are “close to the metal.” This is a fancy way of saying that you are only a few steps removed from writing 0s and 1s for the computer to process. Being close to the metal means your code may be longer and harder to understand, but it will run nearly as fast as possible on your computer.
- Other contenders... Go: A compiled language in the spirit of C but with a lot of built-in conveniences. Rust: the latest cool language that’s low-level like C; it features strong type-checking to help ensure your program does what you think it will. Java: a mid-level language that saw huge popularity 10-20 years ago and is now used more in projects that run on Java-compatible hardware, like some toasters.
Web scraping
We never covered web scraping in this class, but the foundation for learning it is in place now that you know how things work on the web. In general, I like to use Python since I’m used to it.
Web scraping tutorials:
- Implementing Web Scraping in Python with BeautifulSoup: This tutorial assumes some knowledge of how Python works but demonstrates the steps involved in scraping data
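To give a feel for what scraping involves, here is a minimal sketch that pulls every link out of a page. It deliberately uses only Python's built-in html.parser module rather than BeautifulSoup, so it runs without installing anything; BeautifulSoup, which the tutorial covers, offers a friendlier way to do the same job. The page content and the LinkCollector class are made up for illustration, and the HTML is written inline so that no real website is contacted.

```python
from html.parser import HTMLParser

# A hypothetical page, inlined so the example needs no network request.
PAGE = """
<html><body>
  <h1>Shelter listings</h1>
  <ul>
    <li><a href="/shelters/a">Shelter A</a></li>
    <li><a href="/shelters/b">Shelter B</a></li>
  </ul>
</body></html>
"""

class LinkCollector(HTMLParser):
    """Collects (text, href) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> tag we are inside, if any
        self._text = []    # pieces of text seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None

parser = LinkCollector()
parser.feed(PAGE)
print(parser.links)
```

Notice that the scraper is just reading the same HTML tags you wrote in class. With a real site, the only extra step would be downloading the page first, and you should check the site's terms of use before doing so.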