Commit 271e7f2

Added project ideas for this round.
1 parent e7a49c1 commit 271e7f2

21 files changed, +451 -31 lines changed
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
_model: project-idea
2+
---
3+
_hidden: yes
4+
---
5+
title: Add Audio to the CC Catalog & CC Search
6+
---
7+
problem:
8+
Currently, [CC Search](https://search.creativecommons.org/) and the [CC Catalog API](https://api.creativecommons.engineering/) only support image search. We’d like to add more content types, especially audio. This would involve indexing audio sources in the CC Catalog, adding new endpoints to the CC Catalog API, and adding a UI to browse and search for audio in CC Search.
9+
---
10+
expected_outcome:
11+
* There would be scripts in the [CC Catalog](https://github.com/creativecommons/cccatalog) repository that index metadata related to openly licensed audio files and add it to our database.
12+
* The [CC Catalog API](https://api.creativecommons.engineering/) would have a set of endpoints that allow searching for and browsing audio files.
13+
* If time permits, there would be a UI to browse and search for audio in [CC Search](https://search.creativecommons.org/).
14+
---
15+
internship_tasks:
16+
* Work with CC’s data engineer to define and implement a database schema for collecting audio file metadata.
17+
* Write scripts to ingest audio file metadata from open repositories such as [Freesound](https://freesound.org/), [Free Music Archive](https://freemusicarchive.org/), etc.
18+
* Implement additional API endpoints on the CC Catalog API to expose audio data.
19+
* Work with CC’s Director of Product and CC’s UX designer on design mockups for adding audio to CC Search.
20+
* Add components as needed to [Vocabulary](http://opensource.creativecommons.org/cc-vocabulary/) and [Vue Vocabulary](https://cc-vue-vocabulary.netlify.com/), in collaboration with CC’s UX designer.
21+
* Implement the designs in CC Search using Vocabulary and Vue Vocabulary.
22+
---
23+
application_tips:
24+
* Include potential database schemas in your application.
25+
* Include sources that you might want to ingest in your application.
26+
---
27+
resources:
28+
* [CC Catalog code](https://github.com/creativecommons/cccatalog)
29+
* [CC Catalog API documentation](https://api.creativecommons.engineering/)
30+
* [CC Catalog API code](https://github.com/creativecommons/cccatalog-api)
31+
* [CC Search](https://search.creativecommons.org/)
32+
* [CC Search code](https://github.com/creativecommons/cccatalog-frontend)
33+
* [Vocabulary landing page](http://opensource.creativecommons.org/cc-vocabulary/)
34+
* [Vocabulary code](https://github.com/creativecommons/vocabulary)
35+
* [Vue Vocabulary landing page](https://cc-vue-vocabulary.netlify.com/)
36+
* [Vue Vocabulary code](https://github.com/creativecommons/vue-vocabulary)
37+
* Music sources: [Freesound](https://freesound.org/), [Free Music Archive](https://freemusicarchive.org/)
38+
* [Blog about Freesound on CC Open Source blog](https://opensource.creativecommons.org/blog/entries/freesound-intro/)
39+
---
40+
skills_recommended: CSS, Django, Django REST Framework, HTML, JavaScript, Python, Vue.js
41+
---
42+
mentors: Alden Page, Brent Moran, Anna Tumadóttir
43+
---
44+
difficulty: Medium
@@ -0,0 +1,28 @@
1+
_model: project-idea
2+
---
3+
_hidden: yes
4+
---
5+
title: Add Provider API Scripts to CC Catalog
6+
---
7+
problem:
8+
The [CC Catalog](https://github.com/creativecommons/cccatalog) gets a huge amount of its data by pulling image info from APIs via what we are calling ‘Provider API Scripts’. We have a [backlog](https://github.com/orgs/creativecommons/projects/12) of providers which have been vetted, and we’d like to have scripts that pull data from their public APIs and pass it to our storage class. This would increase the breadth of material available from [CC Search](https://search.creativecommons.org/) and the [CC Catalog API](https://api.creativecommons.engineering/).
9+
---
10+
expected_outcome:
11+
We would like to have a number of completed, well-tested Provider API Scripts written by the end of this internship, and they should be deployed in production. Deployment in production implies we’d also have Apache Airflow DAGs (Directed Acyclic Graphs) that run the Provider API Scripts on an appropriate schedule.
12+
---
13+
internship_tasks:
14+
The intern should write more Provider API Scripts, taking priorities from the backlog linked above. Each script must pull image information from the provider's public API and pass it along to a function that will validate the information, format it as necessary, and write it to disk. This validation/storage function is already written, so the intern needs only to write a script that knows how to get the relevant data from the provider's public API. For examples of what we are expecting from a Provider API Script, see `wikimedia_commons.py` and `flickr.py` in [the repository](https://github.com/creativecommons/cccatalog/tree/master/src/cc_catalog_airflow/dags/provider_api_scripts), as well as their accompanying tests.
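
As a rough illustration of the shape described above, here is a minimal Python sketch. It is not taken from the repository: the paginated response layout (`items`, `next_page`), the field names, and the `fetch_page` and `store` callables are all hypothetical stand-ins for a real provider API and for the existing validation/storage function.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical response page: {"items": [...], "next_page": <int or None>}
Page = Dict[str, Any]


def iter_records(fetch_page: Callable[[int], Page]) -> Iterator[dict]:
    """Yield raw records from each page until the API reports no next page."""
    page_number: Optional[int] = 1
    while page_number is not None:
        page = fetch_page(page_number)
        yield from page.get("items", [])
        page_number = page.get("next_page")


def extract_image_info(record: dict) -> Optional[dict]:
    """Pick out the fields to pass on; return None for incomplete records."""
    required = ("landing_url", "image_url", "license")
    if any(not record.get(key) for key in required):
        return None
    return {
        "foreign_landing_url": record["landing_url"],
        "image_url": record["image_url"],
        "license": record["license"],
        "title": record.get("title"),
    }


def run(fetch_page: Callable[[int], Page], store: Callable[[dict], None]) -> int:
    """Pull all records, hand usable ones to `store`, and return the count stored."""
    stored = 0
    for record in iter_records(fetch_page):
        info = extract_image_info(record)
        if info is not None:
            store(info)
            stored += 1
    return stored
```

Passing the fetcher and the storage function in as arguments keeps the script testable without network access, which is one way to approach the "well-tested" requirement.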
15+
---
16+
application_tips:
17+
Knowledge of or experience with pulling real data from public APIs in JSON format would be helpful. It would also help if the intern is familiar with Python.
18+
---
19+
resources:
20+
* [CC Catalog code](https://github.com/creativecommons/cccatalog)
21+
* [Queue of providers waiting to be added](https://github.com/orgs/creativecommons/projects/12)
22+
* [Example scripts](https://github.com/creativecommons/cccatalog/tree/master/src/cc_catalog_airflow/dags/provider_api_scripts)
23+
---
24+
skills_recommended: Python, JSON, Apache Airflow (optional)
25+
---
26+
mentors: Brent Moran, Kriti Godey
27+
---
28+
difficulty: Low
@@ -0,0 +1,46 @@
1+
_model: project-idea
2+
---
3+
_hidden: yes
4+
---
5+
title: Improve CC Search Accessibility
6+
---
7+
problem:
8+
Creative Commons is a global community, yet [CC Search](https://search.creativecommons.org/) lacks some accessibility features; for example, it is not very friendly to users of screen readers. It is also available only in English, which makes the tool less accessible to international audiences and reduces our reach. As an open web tool and platform, CC Search should be accessible to the widest audience possible, in as many languages as possible.
9+
---
10+
expected_outcome:
11+
A release of CC Search which contains:
12+
1. Improvements to the HTML of CC Search with regard to accessibility features, including ARIA attributes, forms, color contrast ratios, and UI changes that improve usage of CC Search for users with disabilities.
13+
2. The implementation of i18n using [vue-i18n](https://kazupon.github.io/vue-i18n/): the currently hardcoded text refactored into compatible translation resources, localized numbers and dates, locale detection so that the appropriate language is loaded when the user visits the CC Search website, and a tool to easily integrate new translations with Transifex.
14+
---
15+
internship_tasks:
16+
* Research and implement accessibility improvements
17+
* Perform usability tests with people with disabilities to identify problems and test solutions to accessibility issues
18+
* Set up `vue-i18n` in CC Search
19+
* Refactor hardcoded English text into translation resource files
20+
* Localize numbers and dates
21+
* Detect the user's preferred language and load the appropriate locale data
22+
* Allow users to change the current locale
23+
* Integrate with Transifex
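
The locale-detection task in the list above comes down to matching the user's ordered language preferences against the locales that have translations. The sketch below shows that matching logic in Python for illustration only; CC Search itself is a Vue.js application, and `pick_locale`, its parameters, and the `en` fallback are hypothetical names rather than part of `vue-i18n`.

```python
from typing import Iterable, Sequence


def pick_locale(preferred: Iterable[str], available: Sequence[str], fallback: str = "en") -> str:
    """Return the first available locale matching the user's preferences.

    `preferred` is an ordered list of language tags such as "pt-BR" (in a
    browser this could come from `navigator.languages`). An exact,
    case-insensitive match is tried first, then the base language ("pt").
    """
    normalized = {locale.lower(): locale for locale in available}
    for tag in preferred:
        tag = tag.lower()
        if tag in normalized:
            return normalized[tag]
        base = tag.split("-")[0]
        if base in normalized:
            return normalized[base]
    return fallback
```

For example, `pick_locale(["pt-BR", "es"], ["en", "es", "pt"])` falls back from `pt-BR` to the base language and returns `"pt"`.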
24+
---
25+
application_tips:
26+
* Good understanding of how to change and refactor code that is under active development
27+
* A plan broken into small tasks, each ideally completable in one week, with no big tasks that take multiple weeks to complete
28+
* Bonus points for good and innovative ideas on how to integrate the actual translations
29+
* Bonus points for doing research into currently existing accessibility issues on CC Search
30+
---
31+
resources:
32+
* [CC Search](https://search.creativecommons.org/)
33+
* [CC Search code](https://github.com/creativecommons/cccatalog-frontend)
34+
* [GitHub issue for i18n](https://github.com/creativecommons/cccatalog-frontend/issues/487)
35+
* [How to add Internationalization to a Vue Application](https://www.freecodecamp.org/news/how-to-add-internationalization-to-a-vue-application-d9cfdcabb03b/)
36+
* [Accessibility at Berkeley](https://webaccess.berkeley.edu/home)
37+
* [W3C Introduction to Web Accessibility](https://www.w3.org/WAI/fundamentals/accessibility-intro/)
38+
* [Web Fundamentals: Accessibility](https://developers.google.com/web/fundamentals/accessibility)
39+
* [Accessibility | MDN Web Docs](https://developer.mozilla.org/en-US/docs/Learn/Accessibility)
40+
41+
---
42+
skills_recommended: CSS, HTML, JavaScript, Vue.js
43+
---
44+
mentors: Breno Ferreira, Kriti Godey
45+
---
46+
difficulty: Medium
@@ -0,0 +1,3 @@
1+
_model: project-ideas-collection
2+
---
3+
_hidden: yes
@@ -0,0 +1,40 @@
1+
_model: project-idea
2+
---
3+
_hidden: yes
4+
---
5+
title: Improvements to the CC WordPress Plugin
6+
---
7+
problem:
8+
CC maintains [a collection of case law and legal scholarship](https://labs.creativecommons.org/caselaw/) relevant to legal issues around Creative Commons licenses. Users can submit information, which is reviewed by a member of CC’s legal team and edited if necessary before publishing it to the live site. This tool currently has a number of issues:
9+
* Publishing new data is a cumbersome manual process for both the legal and tech teams.
10+
* There is no way to browse all resources on the site.
11+
* It does not use [Vocabulary](http://opensource.creativecommons.org/cc-vocabulary/), CC’s new web design system.
12+
---
13+
expected_outcome:
14+
A new website, built using either WordPress or Django, that supports the following features:
15+
* The general public can submit legal information without a user account (the information collected should be identical to the current implementation).
16+
* The legal team at CC can review and edit incoming information and approve it, at which point it goes live.
17+
* All the legal information on the site should be browseable and searchable by keyword and country.
18+
* The user-facing portion of the website should use [Vocabulary](http://opensource.creativecommons.org/cc-vocabulary/) components.
19+
---
20+
internship_tasks:
21+
* Architect the backend of the legal database in either WordPress or Django, including researching appropriate plugins or libraries that will make the task easier.
22+
* Code the backend of the legal database.
23+
* Create design mockups for the frontend in collaboration with CC’s UX designer.
24+
* Add new components to Vocabulary, if necessary, in collaboration with CC’s UX designer.
25+
* Code the frontend of the legal database using Vocabulary.
26+
* Assist CC staff with deployment-related tasks, if needed.
27+
---
28+
application_tips:
29+
Please specify the architecture and plugins/libraries that you’d like to use in your application. We don’t want you to reinvent the wheel; we’d like to use existing libraries as much as possible.
30+
---
31+
resources:
32+
* [Creative Commons Legal Database (beta)](https://labs.creativecommons.org/caselaw/)
33+
* [Legal database GitHub repository](https://github.com/creativecommons/caselaw/)
34+
* [Vocabulary](http://opensource.creativecommons.org/cc-vocabulary/)
35+
---
36+
skills_recommended: CSS, Django, HTML, JavaScript, PHP, Python, WordPress (either Django/Python or WordPress/PHP, not both)
37+
---
38+
mentors: Kriti Godey, Sarah Pearson, Timid Robot Zehta
39+
---
40+
difficulty: Medium
@@ -0,0 +1,34 @@
1+
_model: project-idea
2+
---
3+
_hidden: yes
4+
---
5+
title: Add filtering by node to the Linked Commons
6+
---
7+
problem:
8+
For last year’s GSoC, [María Belén Guaranda Cabezas](https://opensource.creativecommons.org/blog/authors/soccerdroid/) created a [visualization graph of domains](http://dataviz.creativecommons.engineering/) in the Commons, using one month of data from Common Crawl. For details, please see her posts on the CC Open Source Blog, which you can find at [her author page](https://opensource.creativecommons.org/blog/authors/soccerdroid/). We’d like to expand on the current state of that project by adding the ability to filter by domain and show only nodes within distance 2 of the chosen node (i.e., nodes that can be reached by traveling along no more than two edges from it). If that is accomplished in short order, we’d also like to give the user the ability to choose the distance of nodes to show from a given node.
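
The distance filter described above is a bounded breadth-first search. The following Python sketch illustrates the idea only; the visualization itself is written in JavaScript, and the adjacency-mapping representation and the function name `nodes_within_distance` are assumptions made for this example.

```python
from collections import deque
from typing import Dict, Iterable, Set


def nodes_within_distance(
    graph: Dict[str, Iterable[str]], start: str, max_distance: int = 2
) -> Set[str]:
    """Return every node reachable from `start` along at most `max_distance` edges.

    `graph` maps each node to its neighbors. To treat the graph as
    undirected, each edge must appear in both adjacency lists.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, distance = queue.popleft()
        if distance == max_distance:
            continue  # do not expand past the distance limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return seen
```

Making `max_distance` a parameter also covers the stretch goal of letting the user choose the distance.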
9+
---
10+
expected_outcome:
11+
* There should be a search box into which the user can type a domain (or part of a domain), and the graph should then show only nodes from that domain, and domains which can be reached with no more than two hops from the original.
12+
* Stretch Goal: It’d be great if the user could choose a distance from the chosen node via some sort of drop-down menu.
13+
* If both of those go well, we’d like to explore some graph-theoretic metrics on the graph.
14+
* We may also add live updating to the data that backs the visualization.
15+
* If these features are completed ahead of schedule, the intern may suggest further features to add to the visualization.
16+
---
17+
internship_tasks:
18+
The intern should implement the first feature above and, if there’s time, implement the second. It may be useful for the intern to assist with setting up live updating of the data backing the visualization.
19+
---
20+
application_tips:
21+
Interest in and/or experience with graph theory would be useful!
22+
---
23+
resources:
24+
* [The Linked Commons](http://dataviz.creativecommons.engineering/)
25+
* [The Linked Commons code](https://github.com/creativecommons/cccatalog-dataviz)
26+
* [Introducing The Linked Commons](https://creativecommons.org/2020/01/23/introducing-the-linked-commons/) blog post on CC main blog
27+
* [María's technical blog posts](https://opensource.creativecommons.org/blog/authors/soccerdroid/)
28+
* [force-graph library](https://github.com/vasturiano/force-graph)
29+
---
30+
skills_recommended: JavaScript, Graph Theory
31+
---
32+
mentors: Brent Moran, María Belén Guaranda Cabezas
33+
---
34+
difficulty: Low
@@ -0,0 +1,42 @@
1+
_model: project-idea
2+
---
3+
_hidden: yes
4+
---
5+
title: Usage & Reuse Metrics Dashboard for CC Search
6+
---
7+
problem:
8+
[CC Search](https://search.creativecommons.org/), which is CC’s search engine making openly licensed content discoverable, has hundreds of thousands of visitors per month. The catalog powering CC Search contains collections ranging from user-generated content at Flickr to priceless art pieces at the Met, and everywhere in between. The catalog contains a couple of dozen sources at present and is set for significant expansion.
9+
10+
While we do keep track of certain user actions in our database, and are able to glean other insights with the use of Google Analytics, we do not have a single dashboard to see relevant information.
11+
12+
This is a problem for the following reasons:
13+
* We lack a comprehensive understanding of user behavior, as we have to look in multiple places and pull disparate information to paint a picture of overall user behavior.
14+
* Our catalog partners have no way to understand what impact their presence in CC Search is having in terms of discoverability and increased exposure to their collections.
15+
* In turn we are unable to tell a story about that impact to potential, future partners.
16+
---
17+
expected_outcome:
18+
We’d like the outcome to be an analytics interface of some kind, where all of these metrics are pulled together into one place. The interface should be both aesthetically pleasing and easy to understand, allowing those accessing it to get the pertinent information they need, whether it is by day, week, month, collection, or any other logical grouping.
19+
---
20+
internship_tasks:
21+
* Understand the required data to be displayed as defined by CC HQ
22+
* Research analytics dashboards to provide recommendations on additional data that is valuable
23+
* Document a cohesive plan for structuring an analytics database storing all pertinent data
24+
* Mock up (wireframes only) how the data could usefully be presented and which navigational elements, if any, will be required
25+
* Research existing analytics UIs
26+
* Present findings and make a recommendation on which one to use
27+
* Implement an analytics UI for CC Search Usage & Reuse data
28+
---
29+
application_tips:
30+
This is a complicated problem, but we believe there is an elegant solution. We’d like the intern to show that they understand the stakeholders, can argue for the value of being able to provide this data internally and externally, have a clear picture of how one would structure an analytics database (which could use example data like page views, unique visitors, time on page, bounce rate, by URL, by day, by month, etc.) with a view to flexibility for expansion and rapid querying, and an indication of their comfort level with both design and programming. The most important thing is solid development skills. Support for frontend design, based on proposed wireframes, can be provided by the organization.
31+
---
32+
resources:
33+
* [CC Search](https://search.creativecommons.org/)
34+
* [Custom analytics API code](https://github.com/creativecommons/cccatalog-api/tree/master/analytics)
35+
* [Initial analytics requirements and discussion](https://github.com/creativecommons/cccatalog-frontend/issues/426)
36+
* [Google Analytics](https://analytics.google.com/analytics/web/)
37+
---
38+
skills_recommended: CSS, design, HTML, Python, PostgreSQL
39+
---
40+
mentors: Alden Page, Anna Tumadóttir
41+
---
42+
difficulty: High
@@ -0,0 +1,38 @@
1+
_model: project-idea
2+
---
3+
_hidden: yes
4+
---
5+
title: Integrate Vocabulary with CC Open Source & CCGN websites
6+
---
7+
problem:
8+
Creative Commons has many different websites ([CC.org](https://creativecommons.org/), [CC Global Summit](https://summit.creativecommons.org/), [CC Open Source](https://opensource.creativecommons.org/), [CC Certificates](https://certificates.creativecommons.org/), [CC Global Network](https://network.creativecommons.org/), [CC Chapter Sites](https://de.creativecommons.net/start/), etc.), all of which have different design elements and styles. One of our 2020 goals is to unify them all using our new web design system, [Vocabulary](https://cc-vocabulary.netlify.com/). We need help updating the CC Open Source and CCGN websites.
9+
---
10+
expected_outcome:
11+
* Updates to the CC Open Source website replacing all styling with components from Vocabulary.
12+
* Updates to the CC Global Network WordPress theme that build upon our base WordPress theme (this theme is currently in progress) and use components from Vocabulary.
13+
* Updates to our Figma design library (in collaboration with our UX Designer) and Vocabulary itself for any new components that need to be added when redesigning the sites.
14+
---
15+
internship_tasks:
16+
* Create wireframes for the website redesigns using Vocabulary components in Figma
17+
* Identify new components that need to be added to Vocabulary and work with the UX designer to design and implement them
18+
* Update the WordPress theme for the CC Global Network website to use Vocabulary exclusively
19+
* Update the CC Open Source website styling to use Vocabulary exclusively
20+
* Implement the new components in Vue Vocabulary if time permits
21+
---
22+
application_tips:
23+
It is okay if you think you’ll only have time to do one of the two websites. We’d rather you do one of them well than rush.
24+
---
25+
resources:
26+
* [vocabulary GitHub repo](https://github.com/creativecommons/vocabulary)
27+
* [vue-vocabulary GitHub repo](https://github.com/creativecommons/vue-vocabulary)
28+
* [Introducing Vocabulary blog post on CC blog](https://creativecommons.org/2019/12/13/cc-vocabulary-web-design-system/)
29+
* [Posts about Vocabulary on the CC open source blog](https://opensource.creativecommons.org/blog/categories/cc-vocabulary/)
30+
* [CC Open Source](https://opensource.creativecommons.org/)
31+
* [CC Global Network](https://network.creativecommons.org/)
32+
* [CC Base WordPress Theme](https://github.com/creativecommons/wp-theme-base)
33+
---
34+
skills_recommended: CSS, HTML, JavaScript, PHP, WordPress
35+
---
36+
mentors: Dhruv Bhanushali, Hugo Solar
37+
---
38+
difficulty: Medium
