Export annotated citations in a format usable for Excel analysis

Our annotated citations currently live in `.bib` files. Two keys that are especially important for data analysis, `keywords` and `cc-class`, each hold **multiple values**.

Example:
```
@Misc{cc:KoenigRauchWoerter:2025:Monitoring-of-economic-shocks,
  title        = "Real-time Monitoring of Economic Shocks using Company Websites",
  author       = "Michael Koenig and Jakob Rauch and Martin Woerter",
  year         = "2025",
  ...
  primaryclass = "econ.GN",
  keywords     = "large language models, natural language processing, crisis, economic shocks, economic monitoring, Covid-19",
  URL          = "https://arxiv.org/abs/2502.17161",
  abstract     = ...
  cc-author-affiliation = "ETH Zurich, ...
  cc-class     = "economics, economic-monitoring, web-archiving, nlp/large-language-models",
  cc-snippet   = ...
}
```

**Excel needs one value per key in each row for the data to be usable in pivot tables and charts.** The proposed solution is to export the citations to CSV with multiple rows per paper, each row carrying the paper's unique ID.

### Proposal:
Add functionality to `export-csv.py` that produces one large CSV from all of our citations, which:
- can be exported (or copy-pasted) into an Excel file,
- and has multiple rows per citation (one row per key-value pair), enabling pivot tables and charts in Excel.

**Proposed format:** One new aggregate column (**cc-topic**) that combines the values of the `keywords` and `cc-class` fields, and another new column (**cc-og-key**) that records which original key (`keyword` or `cc-class`) each topic came from.

| id | year | primaryclass | cc-og-key | cc-topic | title | authors | cc-author-affiliation | URL | DOI | cc-snippet | 
| -- | ------ | ------------------- | ------------ |------- | ------ | ----------- | ----------------------------- | ----- | ------ | --------------- | 
| cc:Koenig... | 2025 | econ.GN | keyword | large language models | Real-time Monitoring ... | Koenig, ... | ETH ... | http://... | ... | .... |
| cc:Koenig... | 2025 | econ.GN | keyword | natural language processing | Real-time Monitoring ... | Koenig, ... | ETH ... | http://... | ... | .... |
| cc:Koenig... | 2025 | econ.GN | ...| ... | Real-time Monitoring ... | Koenig, ... | ETH ... | http://... | ... | .... |
| cc:Koenig... | 2025 | econ.GN | keyword | Covid-19 | Real-time Monitoring ... | Koenig, ... | ETH ... | http://... | ... | .... |
| cc:Koenig... | 2025 | econ.GN | cc-class | economics | Real-time Monitoring ... | Koenig, ... | ETH ... | http://... | ... | .... |
| cc:Koenig... | 2025 | econ.GN | cc-class | economic-monitoring | Real-time Monitoring ... | Koenig, ... | ETH ... | http://... | ... | .... |
| cc:Koenig... | 2025 | econ.GN | ...| ... | Real-time Monitoring ... | Koenig, ... | ETH ... | http://... | ... | .... |
| cc:Koenig... | 2025 | econ.GN | cc-class | web-archiving | Real-time Monitoring ... | Koenig, ... | ETH ... | http://... | ... | .... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | .... |

- This should allow pivot tables and graphs, either by filtering out the rows for one original key (for example, all rows where `cc-og-key` is `keyword`) or by aggregating all keys (`keyword`, `cc-class`, ...) into the same analysis.
- It gives one row per value for multiple keys that each hold multiple values (without needing an NxN mapping).
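
A minimal sketch of how the one-row-per-value export could look, not a description of what `export-csv.py` currently does. It assumes each citation has already been parsed into a dict of BibTeX fields with the citation key stored under `id`, and that multi-valued fields are comma-separated. The names `explode_entry`, `write_long_csv`, `MULTI_VALUE_FIELDS` and `SINGLE_VALUE_COLUMNS` are hypothetical.

```python
import csv
import io

# Source BibTeX field -> label written to the cc-og-key column
# (labels follow the proposed table above).
MULTI_VALUE_FIELDS = {"keywords": "keyword", "cc-class": "cc-class"}

# Output column -> source BibTeX field for the single-valued columns.
# Assumption: the parsed entry stores its citation key under "id".
SINGLE_VALUE_COLUMNS = {
    "id": "id",
    "year": "year",
    "primaryclass": "primaryclass",
    "title": "title",
    "authors": "author",
    "cc-author-affiliation": "cc-author-affiliation",
    "URL": "URL",
    "DOI": "DOI",
    "cc-snippet": "cc-snippet",
}

HEADER = ["id", "year", "primaryclass", "cc-og-key", "cc-topic", "title",
          "authors", "cc-author-affiliation", "URL", "DOI", "cc-snippet"]


def explode_entry(entry):
    """Yield one row dict per (original key, value) pair of a citation."""
    # Missing single-valued fields become empty cells.
    base = {col: entry.get(src, "") for col, src in SINGLE_VALUE_COLUMNS.items()}
    for field, label in MULTI_VALUE_FIELDS.items():
        for value in entry.get(field, "").split(","):
            value = value.strip()
            if value:  # skip empty pieces, e.g. from a trailing comma
                yield {**base, "cc-og-key": label, "cc-topic": value}


def write_long_csv(entries, out):
    """Write all citations to `out` in the long (one value per row) format."""
    writer = csv.DictWriter(out, fieldnames=HEADER)
    writer.writeheader()
    for entry in entries:
        writer.writerows(explode_entry(entry))


# Example entry built from the fields visible in the citation above;
# elided fields are simply left out.
example = {
    "id": "cc:KoenigRauchWoerter:2025:Monitoring-of-economic-shocks",
    "year": "2025",
    "primaryclass": "econ.GN",
    "title": "Real-time Monitoring of Economic Shocks using Company Websites",
    "author": "Michael Koenig and Jakob Rauch and Martin Woerter",
    "keywords": "large language models, natural language processing, crisis, "
                "economic shocks, economic monitoring, Covid-19",
    "cc-class": "economics, economic-monitoring, web-archiving, "
                "nlp/large-language-models",
    "URL": "https://arxiv.org/abs/2502.17161",
}

buffer = io.StringIO()
write_long_csv([example], buffer)
rows = list(csv.DictReader(io.StringIO(buffer.getvalue())))
print(len(rows))  # 6 keyword rows + 4 cc-class rows
```

When writing to a real file instead of a buffer, opening it with `newline=""` and `encoding="utf-8-sig"` is one way to help Excel read non-ASCII characters correctly.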

