Catalog, API, and Linked Commons contributors are encouraged to contribute to our other Python projects such as the [CC Legal Database](https://github.com/creativecommons/legaldb) or the upcoming [CC Licenses](https://github.com/creativecommons/cc-licenses) project. If you are a CC Search contributor, we recommend checking out frontend projects such as the [CC Chooser](https://github.com/creativecommons/chooser) or [Vocabulary](https://github.com/creativecommons/vocabulary).
content/blog/entries/cc-datacatalog-data-processing-2/contents.lr
+6-1
@@ -16,6 +16,11 @@ pub_date: 2019-07-26
 ---
 body:
 
+> ℹ️ **2023-08-31:** This project was archived along with the shuttering of CC
+Search (now [Openverse](https://openverse.org/)). Please also see the
+[Quantifying the Commons](https://github.com/creativecommons/quantifying)
+project.
+
 This is a continuation of my last blog post about the data processing part of the CC-data catalog visualization project. I recommend you to read that [last post](https://opensource.creativecommons.org/blog/entries/cc-datacatalog-data-processing/) for a better understanding of what I'll explain here.
 
 
@@ -68,7 +73,7 @@ The thresholds for the quantity of images and links are my intuitions from havin
 - Visualization with the data.
 - Development or modification of pruning/filtering rules.
 
-You can follow the project development in the [Github repo](https://github.com/creativecommons/cccatalog-dataviz).
+You can follow the project development in the [Github repo](https://github.com/cc-archive/cccatalog-dataviz).
 
 CC Data Catalog Visualization is my GSoC 2019 project under the guidance of [Sophine
 Clachar](https://creativecommons.org/author/sclachar/), who has been greatly helpful and considerate since the GSoC application period. Also, my backup mentor, [Breno Ferreira](https://creativecommons.org/author/brenoferreira/), and engineering director [Kriti
content/blog/entries/cc-datacatalog-data-processing-3/contents.lr
+6-1
@@ -16,6 +16,11 @@ pub_date: 2019-08-12
 ---
 body:
 
+> ℹ️ **2023-08-31:** This project was archived along with the shuttering of CC
+Search (now [Openverse](https://openverse.org/)). Please also see the
+[Quantifying the Commons](https://github.com/creativecommons/quantifying)
+project.
+
 This is a continuation of my last blog post about the data processing part 2 of the CC-data catalog visualization project. I recommend you to read that [last post](https://opensource.creativecommons.org/blog/entries/cc-datacatalog-data-processing-2/) for a better understanding of what I'll explain here.
 
 Hello! In this post I am going to talk you about the extraction of unique nodes, and links, and the visualization of the force-directed graph with the processed data.
@@ -56,7 +61,7 @@ The other tasks left to do are:
 - Visualization of the pie chart
 - Development or modification of pruning/filtering rules.
 
-You can follow the project development in the [Github repo](https://github.com/creativecommons/cccatalog-dataviz).
+You can follow the project development in the [Github repo](https://github.com/cc-archive/cccatalog-dataviz).
 
 CC Data Catalog Visualization is my GSoC 2019 project under the guidance of [Sophine
 Clachar](https://creativecommons.org/author/sclachar/), who has been greatly helpful and considerate since the GSoC application period. Also, my backup mentor, [Breno Ferreira](https://creativecommons.org/author/brenoferreira/), and engineering director [Kriti
content/blog/entries/cc-datacatalog-data-processing/contents.lr
+6-1
@@ -16,6 +16,11 @@ pub_date: 2019-07-10
 ---
 body:
 
+> ℹ️ **2023-08-31:** This project was archived along with the shuttering of CC
+Search (now [Openverse](https://openverse.org/)). Please also see the
+[Quantifying the Commons](https://github.com/creativecommons/quantifying)
+project.
+
 Welcome to the data processing part of the GSoC project! In this blog post, I am going to tell you about my first thoughts with the real data, and give you some details of the implementation developed so far.
 
 ### Data Extraction
@@ -67,7 +72,7 @@ Another important aspect is the licenses types. In the dataset, we do not have t
 - Data aggregation
 - Visualization with the data + perfectioning pruning/filtering rules
 
-You can follow the project development in the [Github repo](https://github.com/creativecommons/cccatalog-dataviz).
+You can follow the project development in the [Github repo](https://github.com/cc-archive/cccatalog-dataviz).
 
 CC Data Catalog Visualization is my GSoC 2019 project under the guidance of [Sophine
 Clachar](https://creativecommons.org/author/sclachar/), who has been greatly helpful and considerate since the GSoC application period. Also, my backup mentor, [Breno Ferreira](https://creativecommons.org/author/brenoferreira/), and engineering director [Kriti
content/blog/entries/cc-datacatalog-data-thelinkedcommons/contents.lr
+6-1
@@ -16,6 +16,11 @@ pub_date: 2019-09-03
 ---
 body:
 
+> ℹ️ **2023-08-31:** This project was archived along with the shuttering of CC
+Search (now [Openverse](https://openverse.org/)). Please also see the
+[Quantifying the Commons](https://github.com/creativecommons/quantifying)
+project.
+
 This is a continuation of my last blog post about the data processing part 3 of the CC-data catalog visualization project. I recommend you to read that [last post](https://opensource.creativecommons.org/blog/entries/cc-datacatalog-data-processing-3/) for a better understanding of what I'll explain here.
 
 Hello! In this last post, I am going to talk you about the final visualization. First, I would like to talk about the data and share my recommendations.
@@ -87,7 +92,7 @@ Here is the final visualization, using a sample data from one month of the Commo
 <br>
 
 
-You can check the whole project source code in the [Github repo](https://github.com/creativecommons/cccatalog-dataviz/tree/master/GSoC2019).
+You can check the whole project source code in the [Github repo](https://github.com/cc-archive/cccatalog-dataviz/tree/master/GSoC2019).
 
 ### Final comments and future work
 This was my first experience with big data visualization, and I really enjoyed it!
content/blog/entries/cc-datacatalog-visualization/contents.lr
+6-1
@@ -16,6 +16,11 @@ pub_date: 2019-06-17
 ---
 body:
 
+> ℹ️ **2023-08-31:** This project was archived along with the shuttering of CC
+Search (now [Openverse](https://openverse.org/)). Please also see the
+[Quantifying the Commons](https://github.com/creativecommons/quantifying)
+project.
+
 _“By visualizing information, we turn it into a landscape that you can explore with your eyes.”_ – David McCandless.
 
 The landscape of licensed content is wide and varied. We have domains linking to other domains, different license types, and some metadata. This information is extracted from the Internet monthly by [Common Crawl](http://commoncrawl.org/). It is fair to mention that we have 250 million works and growing! If you didn't know we had so much licensed content, well then, this is one of the goals of this project: show the users how licensers are connected, their licensed content, and show how the licensing wave is expanding.
@@ -73,7 +78,7 @@ As the front-end is complete, I am going to get my hands dirty with the data. Fu
 - Reviewing of part of the dataset
 - Implementation of a module for cleaning and parsing the data
 
-You can follow the project development in the [Github repo](https://github.com/creativecommons/cccatalog-dataviz).
+You can follow the project development in the [Github repo](https://github.com/cc-archive/cccatalog-dataviz).
 
 CC Data Catalog Visualization is my GSoC 2019 project under the guidance of [Sophine
 Clachar](https://creativecommons.org/author/sclachar/), who has been greatly helpful and considerate since the GSoC application period. Also, my backup mentor, [Breno Ferreira](https://creativecommons.org/author/brenoferreira/), and engineering director [Kriti
content/blog/entries/linked-commons-autocomplete-feature/contents.lr
+1-1
@@ -78,4 +78,4 @@ Here are some aggregated result statistics.
 In the next blog, we will be covering the long awaited data update and the new architecture.
 
 ## Conclusion
-Overall, I enjoyed working on this feature and it was a great learning experience. This feature has been successfully integrated to the development version, do check it out. Now that you have read this blog till the end, I hope that you enjoyed it. For more information please visit our [Github repo](https://github.com/creativecommons/cccatalog-dataviz/). We are looking forward to hearing from you on linked commons. Our [slack](https://creativecommons.slack.com/channels/cc-dev-cc-catalog-viz) doors are always open to you, see you there. :)
+Overall, I enjoyed working on this feature and it was a great learning experience. This feature has been successfully integrated to the development version, do check it out. Now that you have read this blog till the end, I hope that you enjoyed it. For more information please visit our [Github repo](https://github.com/cc-archive/cccatalog-dataviz/). We are looking forward to hearing from you on linked commons. Our [slack](https://creativecommons.slack.com/channels/cc-dev-cc-catalog-viz) doors are always open to you, see you there. :)
content/blog/entries/linked-commons-data-update/contents.lr
+1-1
@@ -83,7 +83,7 @@ This small change in the design simplified a lot of things, and now the new grap
 
 ## Conclusion
 
-This task was really challenging and I learnt a lot. It was really mesmerizing to see the **Linked Commons grow and evolve**. I hope you enjoyed reading this blog. You can follow the project development [here](https://github.com/creativecommons/cccatalog-dataviz/), and access the stable version of linked commons [here](http://dataviz.creativecommons.engineering/).
+This task was really challenging and I learnt a lot. It was really mesmerizing to see the **Linked Commons grow and evolve**. I hope you enjoyed reading this blog. You can follow the project development [here](https://github.com/cc-archive/cccatalog-dataviz/), and access the stable version of linked commons [here](http://dataviz.creativecommons.engineering/).
 
 Feel free to report bugs and suggest features. It will help us improve this project. If you wish to join the our team, consider joining our [slack](https://creativecommons.slack.com/channels/cc-dev-cc-catalog-viz) channel. Read more about our community teams [here](https://opensource.creativecommons.org/community/). See you in my next blog! 🚀
 - We realized that the client-side graph filtering method is not very scalable. This PR adds the basic structure for the backend server and adds server-side graph filtering logic.
 - Added a parser to convert the input JSON file from `{nodes:[], links:[]}` schema to the distance list format.
 
@@ -100,7 +100,7 @@ Description: Returns a set of nodes which contains the {query} pattern in their
 - Added query autocomplete feature, to enable users to explore all the nodes in the database.
 - This functionality aims to minimize the number of misspelt filtering tries from the client.
 - Refer to [this blog](/blog/entries/linked-commons-autocomplete-feature/) for the motivation and detailed report on why we added autocomplete aka node suggestions feature.
content/blog/entries/linked-commons-whats-new/contents.lr
+1-1
@@ -72,4 +72,4 @@ In the next two weeks, I will be working on the following features.
 * Update the visualization with a more recent and bigger dataset
 
 ## Conclusion
-Overall, it was fantastic and rejuvenating experience working on these tasks. Now that you have read this blog till the end, I hope that you enjoyed it. For more information visit our [Github repo](https://github.com/creativecommons/cccatalog-dataviz/). We are looking forward to hearing from you on linked commons. Our [slack](https://creativecommons.slack.com/channels/cc-dev-cc-catalog-viz) doors are always open to you, see you there. :)
+Overall, it was fantastic and rejuvenating experience working on these tasks. Now that you have read this blog till the end, I hope that you enjoyed it. For more information visit our [Github repo](https://github.com/cc-archive/cccatalog-dataviz/). We are looking forward to hearing from you on linked commons. Our [slack](https://creativecommons.slack.com/channels/cc-dev-cc-catalog-viz) doors are always open to you, see you there. :)
content/blog/entries/meet-gsoc-2019-students/contents.lr
+1-1
@@ -94,7 +94,7 @@ summer. Here they are!
 the most content, which CC licenses are used the most, and much more.
 She will be mentored by our Data Engineer <a href="https://creativecommons.org/author/sclachar/">Sophine Clachar</a> with backup from Breno Ferreira.</p>
 
-<p>You can follow the progress of this project through the <a href="https://github.com/creativecommons/cccatalog-dataviz">GitHub repo</a> or on the <code>#gsoc-cc-catalog-viz</code> channel on our <a href="https://creativecommons.github.io/community/">Slack community</a>.</p>
+<p>You can follow the progress of this project through the <a href="https://github.com/cc-archive/cccatalog-dataviz">GitHub repo</a> or on the <code>#gsoc-cc-catalog-viz</code> channel on our <a href="https://creativecommons.github.io/community/">Slack community</a>.</p>