It only takes one bad apple to spoil the bunch.
Introduction
In the previous post I provided an overview of the data acquisition stage of the data pipeline (Figure 1). In this post I’ll provide an overview of the data Quality Assurance and Control (QAQC) stage. Data quality assurance and control is the process of…
Continue reading “The Data Pipeline: Data QAQC”
Author Archives: adubak
The Data Pipeline: Data Acquisition
Data Quality Control and Assurance Starts Now!
Introduction
In the previous post I provided an overview of the database development stage of the data pipeline, and emphasized that database design is largely driven by the data acquisition protocols. In this post I’ll 1) provide an overview of the third stage in the data pipeline: Data…
Continue reading “The Data Pipeline: Data Acquisition”
The Data Pipeline: Database Development
How data acquisition protocols drive database design.
Introduction
In the previous post I made the distinction between data stewardship and data management, introduced the data pipeline (Figure 1), provided an overview of the planning stage of the pipeline, and made a call to action for everyone involved in data management to think like a data…
Continue reading “The Data Pipeline: Database Development”
The Data Pipeline: Planning
In the previous post we discussed some of the challenges of, and common misconceptions about, data management. In this post we’ll cover the difference between data management and data stewardship, give an overview of the seven steps of the data pipeline, and make a call to action for all of us to think like data stewards when it comes to the data we collect.
Data management?…But I don’t want to!
Why does data management in the environmental sciences seem to be undervalued? Data management probably isn’t entirely undervalued; part of the problem is likely that, as environmental scientists, we may not yet have the “tools in our tool chest” to become good data managers. After all, we’re field scientists, not data scientists, right? I would argue that we need to be both.
The objective of this blog is to help improve data management in the environmental sciences. To this end, I’ll present an introduction to data management and a database schema model, with the intent of moving us towards a more integrated approach to data management in the environmental sciences. The material presented will be equivalent to that of a graduate-level course in data management.