|
@@ -4,7 +4,7 @@ This repository downloads the Andrew dataset on global CO2 emissions from cement
|
|
|
|
|
|
## Description
|
|
|
|
|
|
-This repository downloads datasets on global CO2 emissions from cement production from [Zenodo](https://zenodo.org/records/10008931).
|
|
|
+This repository downloads data on global CO2 emissions from cement production from [Zenodo](https://zenodo.org/records/10008931).
|
|
|
The downloaded dataset can then be converted into CSV (.csv file extension) or NetCDF (.nc file extension) format.
|
|
|
The data management tool [DataLad](http://docs.datalad.org/en/stable/) is used to version control the data sets.
|
|
|
Commands to run the scripts are executed via the pydoit package.
|
|
@@ -12,12 +12,9 @@ Commands to run the scripts are executed via the pydoit package.
|
|
|
### Installation
|
|
|
|
|
|
- Install datalad according to the [DataLad handbook](https://handbook.datalad.org/en/latest/intro/installation.html). It is recommended to install globally.
|
|
|
-- DataLad is based on Git. You need to [install Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) to run DataLad. If you are already a Git user you can skip this step.
|
|
|
-- You need to have [Python](https://www.python.org) installed on your computer.
|
|
|
-- [pydoit](https://pydoit.org/install.html) can be installed with
|
|
|
-```
|
|
|
-pip install doit
|
|
|
-```
|
|
|
+- DataLad is based on Git. [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) needs to be installed to run DataLad.
|
|
|
+- Install [Python](https://www.python.org)
|
|
|
+- [pydoit](https://pydoit.org/install.html)
|
|
|
|
|
|
## Getting Started
|
|
|
|
|
@@ -36,7 +33,7 @@ original and extracted files with the following command.
|
|
|
```
|
|
|
datalad get <filename>
|
|
|
```
|
|
|
-For example, the extracted data set for the 2023/09/13 release can be downloaded with:
|
|
|
+For example, the CSV file for the 2023/09/13 release can be downloaded with:
|
|
|
```
|
|
|
datalad get extracted_data/v230913/Robbie_Andrew_Cement_Production_CO2_230913.csv
|
|
|
```
|
|
@@ -62,9 +59,9 @@ doit read_version --version <YYMMDD>
|
|
|
## <a name="newversion"></a> How to add a new version
|
|
|
|
|
|
|
|
|
-To add a new version go to **versions.py** in the **src** directory and create a new entry in the
|
|
|
+1. To add a new version go to **versions.py** in the **src** directory and create a new value in the
|
|
|
dictionary. Fill in all the required information, following the pattern of the previous entries.
|
|
|
-For example, the code added for the 13-Sep-2023 release looks like this:
|
|
|
+For example, the value _v230913_ in the _versions_ dictionary describes the 13-Sep-2023 release.
|
|
|
````python
|
|
|
versions = {
|
|
|
"v230913": {
|
|
@@ -89,7 +86,7 @@ versions = {
|
|
|
}
|
|
|
````
|
|
|
|
|
|
-Then run the two commands as described in [3.2]() and [3.3]().
|
|
|
+2. Then run the two commands as described in [3.2] and [3.3].
|
|
|
|
|
|
## Help
|
|
|
Show all doit commands
|
|
@@ -110,21 +107,21 @@ doit help <command>
|
|
|
|
|
|
### For developers
|
|
|
#### Repository structure
|
|
|
-- /**.datalad** contains config file for datalad
|
|
|
-- /**downloaded_data** contains original data from Zenodo.
|
|
|
-- /**extracted_data** contains data in .csv and .nc format
|
|
|
-- /**literature** contains link to publication by Robbie M. Andrew. You need to get it with datalad get
|
|
|
-- **src/** adjust this for all
|
|
|
- - **download_version.py** downloads files from zenodo for a given version. The version to read will be taken from the command line using argparse.
|
|
|
- - **download_version_datalad.py** calls datalad API to run the data reading function.
|
|
|
+- **.datalad/** contains config file for datalad
|
|
|
+- **downloaded_data/** contains original data from Zenodo.
|
|
|
+- **extracted_data/** contains data in .csv and .nc format
|
|
|
+- **literature/** contains link to publication by Robbie M. Andrew. Can be downloaded with _datalad get_ command
|
|
|
+- **src/**
|
|
|
+ - **download_version.py** downloads files from zenodo for a given version. The version to read will be taken from the command line using _argparse_.
|
|
|
+ - **download_version_datalad.py** calls datalad to run the data reading function.
|
|
|
- **helper_functions.py** contains a function to map country codes.
|
|
|
- - **read_version.py** reads the data for a given version and saves to primap2 native and
|
|
|
+ - **read_version.py** reads the data for a given version and saves to [PRIMAP2](https://primap2.readthedocs.io/en/stable/) native and
|
|
|
interchange format.
|
|
|
- **read_version_datalad.py** calls datalad to run the data reading function.
|
|
|
- **versions.py** contains a dictionary with metadata for each release. This file should be updated when [adding a new version](#a-namenewversiona-how-to-add-a-new-version).
|
|
|
- **dodo.py** defines pydoit commands.
|
|
|
-- **pyproject.toml**
|
|
|
-- **requirements.txt** dependencies for virtual environment
|
|
|
+- **pyproject.toml** configuration file
|
|
|
+- **requirements.txt** requirements
|
|
|
- **requirements_dev.txt** development requirements
|
|
|
-- **setup.cfg** all the requirments are actually here
|
|
|
-- **setup.py** to install python packages
|
|
|
+- **setup.cfg** requirements
|
|
|
+- **setup.py** installs python packages
|