add first version of readme

Daniel Busch 1 year ago
parent
commit
321b0b83d3
1 changed file with 130 additions and 0 deletions

+ 130 - 0
README.md

@@ -0,0 +1,130 @@
+# Global CO2 from Cement Production Dataset
+
+This repository downloads Robbie Andrew's dataset on global CO2 emissions from cement production from Zenodo.
+
+## Description
+
+This repository downloads datasets on global CO2 emissions from cement production from [Zenodo](https://zenodo.org/records/10008931).
+A downloaded dataset can then be converted to CSV (`.csv`) or NetCDF (`.nc`) format.
+The data management tool [DataLad](http://docs.datalad.org/en/stable/) is used to version-control the datasets.
+The scripts are run through the pydoit package.
+
+### Installation
+
+- Install DataLad by following the [DataLad handbook](https://handbook.datalad.org/en/latest/intro/installation.html). A global installation is recommended.
+- DataLad is based on Git, so you need to [install Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) to run DataLad. If Git is already installed, you can skip this step.
+- You need to have [Python](https://www.python.org) installed on your computer.
+- [pydoit](https://pydoit.org/install.html) can be installed with:
+```
+pip install doit
+```
+
+## Getting Started
+
+### 1. Clone the repository
+
+Download the repository using the following command.
+```
+datalad clone <repository-url>
+```
+Do not use **git clone** to download the repository! A plain Git clone does not give DataLad
+the information it needs to run the program.
+
+### 2. Easy Access
+Users who simply want the data can fetch both the
+original and the extracted files with the following command.
+```
+datalad get <filename>
+```
+For example, the extracted data set for the 2023/09/13 release can be downloaded with:
+```
+datalad get extracted_data/v230913/Robbie_Andrew_Cement_Production_CO2_230913.csv 
+```
+
+
+### 3. Executing the program
+
+#### 3.1 Set up the virtual environment with doit
+```
+doit setup_env
+```
+#### <a name="download"></a> 3.2 Download a version from the command line
+This downloads all files for the given version from Zenodo unchanged.
+```
+doit download_version --version <YYMMDD>
+```
+#### <a name="convert"></a> 3.3 Convert the data sets into CSV and NetCDF files.
+```
+doit read_version --version <YYMMDD>
+```
+
+
+## <a name="newversion"></a> How to add a new version
+
+
+To add a new version, go to **versions.py** in the **src** directory and create a new entry in the
+dictionary. Fill in all the required information, following the previous entries.
+For example, the entry added for the 13-Sep-2023 release looks like this:
+````python
+versions = {
+    "v230913": {
+        'date': '13-Sep-2023',
+        'ver_str_long': 'version 230913',
+        'ver_str_short': '230913',
+        "folder": "v230913",
+        "transpose": False,
+        "filename": "0. GCP-CEM.csv",
+        'ref': '10.5281/zenodo.8339353',
+        'ref2': '10.5194/essd-11-1675-2019',
+        'title': 'Global CO2 emissions from cement production',
+        'institution': "CICERO - Center for International Climate Research",
+        'filter_keep': {},
+        'filter_remove': {},
+        'contact': "johannes.guetschow@climate-resource.com",
+        'comment': ("Published by Robbie Andrew, converted to PRIMAP2 format by "
+                    "Johannes Gütschow"),
+        'unit': 'kt * CO2 / year',
+        'country_code': True,
+    },
+}
+````
+
+Then run the two commands as described in [3.2](#download) and [3.3](#convert).
+
+## Help
+Show all doit commands:
+```
+doit help
+```
+List the doit tasks defined for this repository:
+```
+doit list
+```
+
+Get help on a specific command:
+
+```
+doit help <command>
+```
+
+
+### For developers
+#### Repository structure
+- /**.datalad** contains the configuration file for DataLad.
+- /**downloaded_data** contains the original data from Zenodo.
+- /**extracted_data** contains the data in .csv and .nc format.
+- /**literature** contains a link to the publication by Robbie M. Andrew. You need to fetch it with `datalad get`.
+- **src/** contains the scripts:
+  - **download_version.py** downloads the files from Zenodo for a given version. The version to download is read from the command line using argparse.
+  - **download_version_datalad.py** calls the DataLad API to run the data reading function.
+  - **helper_functions.py** contains a function to map country codes.
+  - **read_version.py** reads the data for a given version and saves it in the PRIMAP2 native and
+    interchange formats.
+  - **read_version_datalad.py** calls DataLad to run the data reading function.
+  - **version.py** contains a dictionary with the metadata for each release. Update this file when [adding a new version](#newversion).
+- **dodo.py** defines the pydoit commands.
+- **pyproject.toml** 
+- **requirements.txt** dependencies for virtual environment
+- **requirements_dev.txt** development requirements
+- **setup.cfg** lists the actual requirements.
+- **setup.py** is used to install the Python packages.
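
The dictionary entry shown in the README can be checked before the download and conversion tasks are run. The following is a minimal sketch, not part of the repository: `REQUIRED_KEYS` and `check_entry` are hypothetical names, the key list is copied from the v230913 example, and the rule that the dictionary key, `folder`, and `ver_str_short` agree is an assumption inferred from that single entry.

```python
# Hypothetical helper, not part of the repository: checks that a new entry
# in the `versions` dictionary has the same keys as the v230913 example.

# Keys copied from the v230913 entry shown in the README.
REQUIRED_KEYS = {
    "date", "ver_str_long", "ver_str_short", "folder", "transpose",
    "filename", "ref", "ref2", "title", "institution", "filter_keep",
    "filter_remove", "contact", "comment", "unit", "country_code",
}


def check_entry(key, entry):
    """Return a list of problems found in one entry; empty means it looks fine."""
    missing = sorted(REQUIRED_KEYS - set(entry))
    problems = [f"missing key: {name}" for name in missing]
    # Assumption: key, 'folder' and 'ver_str_short' stay in sync,
    # as they do in the v230913 example ("v230913", "v230913", "230913").
    if "folder" in entry and entry["folder"] != key:
        problems.append(f"'folder' should be {key!r}")
    if "ver_str_short" in entry and "v" + str(entry["ver_str_short"]) != key:
        problems.append(f"'ver_str_short' should be {key[1:]!r}")
    return problems


if __name__ == "__main__":
    # Placeholder entry with every required key present.
    entry = dict.fromkeys(REQUIRED_KEYS, "")
    entry.update(folder="v230913", ver_str_short="230913")
    print(check_entry("v230913", entry))  # []

    del entry["unit"]
    print(check_entry("v230913", entry))  # ['missing key: unit']
```

A check like this could be run by hand after editing the dictionary; it only compares keys and naming, and does not verify that the DOI, filename, or unit are correct.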