tushar912
diff --git a/‎.github/CODEOWNERS
+1-1 b/‎.github/CODEOWNERS
+1-1
diff --git a/‎content/archives/contents.lr
+1-1 b/‎content/archives/contents.lr
+1-1
diff --git a/‎content/blog/authors/AyanChoudhary/contents.lr
+10 b/‎content/blog/authors/AyanChoudhary/contents.lr
+10
diff --git a/‎content/blog/authors/ahmadbilaldev/contents.lr
+1-1 b/‎content/blog/authors/ahmadbilaldev/contents.lr
+1-1
diff --git a/‎content/blog/authors/conye/contents.lr
+1-1 b/‎content/blog/authors/conye/contents.lr
+1-1
diff --git a/‎content/blog/authors/dhruvkb/contents.lr
+1-1 b/‎content/blog/authors/dhruvkb/contents.lr
+1-1
diff --git a/‎content/blog/authors/obulat/contents.lr
+1-1 b/‎content/blog/authors/obulat/contents.lr
+1-1
diff --git a/‎content/blog/authors/soccerdroid/contents.lr
+2-2 b/‎content/blog/authors/soccerdroid/contents.lr
+2-2
diff --git a/‎content/blog/authors/subhamX/contents.lr
+8 b/‎content/blog/authors/subhamX/contents.lr
+8
diff --git a/‎content/blog/categories/cc-dataviz/contents.lr
+1 b/‎content/blog/categories/cc-dataviz/contents.lr
+1
diff --git a/‎content/blog/entries/cc-search-accessibility-and-internationalization/contents.lr
+35 b/‎content/blog/entries/cc-search-accessibility-and-internationalization/contents.lr
+35
diff --git a/‎content/blog/entries/cc-search-accessibility-week1-2/audit.png
122 KB b/‎content/blog/entries/cc-search-accessibility-week1-2/audit.png
122 KB
diff --git a/‎content/blog/entries/cc-search-accessibility-week1-2/contents.lr
+71 b/‎content/blog/entries/cc-search-accessibility-week1-2/contents.lr
+71
diff --git a/‎content/blog/entries/cc-search-accessibility-week1-2/final.png
78.7 KB b/‎content/blog/entries/cc-search-accessibility-week1-2/final.png
78.7 KB
diff --git a/‎content/blog/entries/cc-vocabulary-the-main-course/contents.lr
+1-1 b/‎content/blog/entries/cc-vocabulary-the-main-course/contents.lr
+1-1
diff --git a/‎content/blog/entries/cc-vocabulary-week9-13/contents.lr
+1-1 b/‎content/blog/entries/cc-vocabulary-week9-13/contents.lr
+1-1
diff --git a/‎content/blog/entries/data-flow-API-to-DB/contents.lr
+105 b/‎content/blog/entries/data-flow-API-to-DB/contents.lr
+105
diff --git a/‎content/blog/entries/data-flow-API-to-DB/loader_workflow.png
57.1 KB b/‎content/blog/entries/data-flow-API-to-DB/loader_workflow.png
57.1 KB
@@ -1,7 +1,7 @@
 # These owners will be the default owners for everything in
 # the repo. Unless a later match takes precedence, they will 
 # be requested for review when someone opens a pull request.
-* @creativecommons/engineering @creativecommons/ct-cc-open-source-core-committers 
+* @creativecommons/engineering @creativecommons/ct-cc-open-source-core-committers @creativecommons/ct-cc-open-source-collaborators
 
 # These users own any files in the specified directory and 
 # any of its subdirectories.
 
@@ -6,4 +6,4 @@ body:
 
 This section contains archives related to older CC projects.
 
-* [CC Tech Blog (2007-2014)](/archives/old-tech-blog/entries)
+* [CC Tech Blog (2007-2014)](/archives/old-tech-blog/entries/)
@@ -0,0 +1,10 @@
+username: AyanChoudhary
+---
+name: Ayan Choudhary
+---
+md5_hashed_email: ba5f8ac4afb162644051544e25b5cfe8
+---
+about:
+Ayan Choudhary is an Electrical Engineering undergraduate student from India and will be interning with Creative Commons during the summer. He has been involved with coding quite heavily for the past couple of years which is one of his numerous hobbies. Some of the sectors which really fascinate him include network security, blockchain, and data science. Apart from this he loves reading and painting and is quite interested in PC gaming and binge-watching online shows.
+He is working on [ccsearch](https://github.com/creativecommons/cccatalog-frontend) as a part of GSoC20.
+He is `@ayan` on slack.
@@ -7,4 +7,4 @@ md5_hashed_email: 870502bc55d77d77522ad3f27876b511
 about:
 Ahmad Bilal is a Computer Science undergrad from UET Lahore, who likes computers, problems and using the former to solve the later. He is always excited about Open Source, and is currently focused on Node.js, Serverless, GraphQL, Cloud, Gatsby.js with React.js and WordPress. He likes organizing meetups, conferences and meeting new people. Cats are his weakness, and he is a sucker for well-engineered cars.
 
-Ahmad is working on [the CC WordPress plugin](https://github.com/creativecommons/creativecommons-wordpress-plugin) as part of [Google Summer of Code 2019](/gsoc-2019).
+Ahmad is working on [the CC WordPress plugin](https://github.com/creativecommons/creativecommons-wordpress-plugin) as part of [Google Summer of Code 2019](/gsoc-2019/).
@@ -7,4 +7,4 @@ md5_hashed_email: 9088efad6d512ef79556a3b6adcf048f
 about:
 Chidiebere Onyegbuchulem is a Frontend developer based in Lagos, Nigeria.
 
-Chidi is currently working on [CC Vocabulary](https://github.com/creativecommons/cc-vocabulary) as part of 2019-2020 [Outreachy Internship](/programs/outreachy/2019-12-start).
+Chidi is currently working on [CC Vocabulary](https://github.com/creativecommons/cc-vocabulary) as part of 2019-2020 [Outreachy Internship](/programs/outreachy/2019-12-start/).
@@ -7,4 +7,4 @@ md5_hashed_email: 0eab64adad056cff2492e7c407a9aa21
 about:
 Dhruv Bhanushali is a Mumbai-based software developer and an Engineering-Physics graduate from IIT Roorkee. He started programming as a hobby in high-school and having found his calling, is now pursuing a career in the field. He is a huge fan of alternative and post-rock music and keeps his curated collection with him at all times. He also loves to binge watch TV shows and movies, especially indie art films.
 
-Dhruv developed [CC Vocabulary](https://opensource.creativecommons.org/cc-vocabulary/) as part of [Google Summer of Code 2019](/gsoc-2019) and now is a maintainer for the project. He is consistently [`@dhruvkb`](https://dhruvkb.github.io/) everywhere.
+Dhruv developed [CC Vocabulary](https://opensource.creativecommons.org/cc-vocabulary/) as part of [Google Summer of Code 2019](/gsoc-2019/) and now is a maintainer for the project. He is consistently [`@dhruvkb`](https://dhruvkb.github.io/) everywhere.
@@ -7,4 +7,4 @@ md5_hashed_email: acd34b5434369aeaf31de8ea94368bf0
 about:
 [Olga](https://creativecommons.org/author/obulat/) is a developer based in Istanbul, Turkey. She loves programming in Python and Javascript. Her main areas of interest are web development, Natural Language Processing, languages, geography and education. Apart from that, she is busy raising her (soon to be) three kids.
 
-Olga is currently working on improving [the CC License Chooser](https://github.com/creativecommons/cc-chooser) as part of 2019-2020 [Outreachy Internship](/programs/outreachy).
+Olga is currently working on improving [the CC License Chooser](https://github.com/creativecommons/cc-chooser) as part of 2019-2020 [Outreachy Internship](/programs/outreachy/).
@@ -5,6 +5,6 @@ name: María Belén Guaranda Cabezas
 md5_hashed_email: a177edcce952c2c82ac8716a4586a28f
 ---
 about:
-Maria is an undergraduate Computer Science student from ESPOL, in Ecuador. She has worked for the past 2 years as a research assistant. She has worked in projects including computer vision, the estimation of socio-economic indexes through CDRs analysis, and a machine learning model with sensors data. During her spare time, she likes to watch animes and read. She loves sports, especially soccer. She is also committed to environmental causes, and she is a huge fan of cats and dogs (she has 4 and 1 respectively).
+Maria is a Bachelor of Computer Science from Ecuador. As a research assistant, she worked in projects including computer vision, the estimation of socio-economic indexes through CDRs analysis, and a machine learning model with sensors data. During her spare time, she likes to watch animes and read. She loves sports, especially soccer. She is also committed to environmental causes, and she is a huge fan of cats and dogs (she has 4 and 1 respectively).
 
-Maria is working on [data visualizations of the CC Catalog](https://github.com/creativecommons/cccatalog-dataviz) as part of [Google Summer of Code 2019](/gsoc-2019).
+Maria worked in the [data visualizations of the CC Catalog](https://github.com/creativecommons/cccatalog-dataviz) as part of [Google Summer of Code 2019](/gsoc-2019/), and is currently a mentor in this year's edition of the program.
@@ -0,0 +1,8 @@
+username: subhamX
+---
+name: Subham Sahu
+---
+md5_hashed_email: 1ca2562f3046509e3273fe5afd3fdab2
+---
+about:
+Subham Sahu is an undergraduate student from Indian Institute Of Technology, Ropar. He is currently working on the [Linked Commons](https://github.com/creativecommons/cccatalog-dataviz) as part of [Google Summer of Code 2020](/gsoc-2020/).
@@ -0,0 +1 @@
+name: cc-dataviz
@@ -0,0 +1,35 @@
+title: CC Search, Proposal Drafting and Community Bonding
+---
+categories:
+cc-search
+community
+gsoc
+open-source
+
+---
+author: AyanChoudhary
+---
+series: gsoc-2020-ccsearch-accessibility
+---
+pub_date: 2020-05-22
+---
+body:
+
+### Proposal Drafting
+
+The majority of my time in March was spent on drafting the proposal for my project **Improve CC Search Accessibility**.
+While drafting my proposal I had two broad topics to focus on: Accessibility and Internationalization.
+
+The first thing I did was go through the resources available to me, such as the w3 guidelines for accessibility, dequeuniversity accessibility insights and the MDN notes on accessibility.
+After I acquainted myself with all of these, the next challenge was to sort out which metrics were relevant and important enough to be detailed in the proposal, along with some other metrics that made notable appearances.
+By including all of these, I had the accessibility part of my proposal complete. Next, I had to work out the part for internationalization. Since it was already decided that we would be using vue-i18n, I did some research on how we could leverage it to get the best possible result.
+
+One of the important parts of internationalization happens to be deciding upon the JSON structure which was a highlighted section in my proposal.
+The other notable sections included strategies for modification of templates while translating and also how the translations would be carried out without hindering any further development of the platform.
+
+### Community Bonding
+
+Community Bonding involved getting to know the mentors and the people whom I will be working with during this internship. We also decided to run the audit tests for the cc-search website during this time, as it would help identify the key issues we would be facing and provide a suitable foundation to start working upon.
+The audits were done using Lighthouse, Accessibility Insights and pa11y and they provided useful insights on which parts of the website we should be focusing on such as the contrast issues and the aria-label fixes.
+
+Coming up next will be the progress on the first 2 weeks of the project.
@@ -0,0 +1,71 @@
+title: CC Search, Setting up vue-i18n and internationalizing homepage
+---
+categories:
+cc-search
+community
+gsoc
+open-source
+
+---
+author: AyanChoudhary
+---
+series: gsoc-2020-ccsearch-accessibility
+---
+pub_date: 2020-06-10
+---
+body:
+
+These are the first two weeks of my internship with CC. I am working on improving the accessibility of cc-search and internationalizing it as well.
+We started by compiling the accessibility reports from accessibility insights, lighthouse and pa11y into a single document and then opening appropriate issues on the repo to address them.
+
+The accessibility issues are listed here:
+1. [Accessibility - Improve labels](https://github.com/creativecommons/cccatalog-frontend/issues/996)
+2. [Evaluate keyboard navigation effectiveness](https://github.com/creativecommons/cccatalog-frontend/issues/997)
+3. [Fix color contrast problems](https://github.com/creativecommons/cccatalog-frontend/issues/998)
+4. [Improve elements markup](https://github.com/creativecommons/cccatalog-frontend/issues/999)
+5. [Evaluate any accessibility linter tools](https://github.com/creativecommons/cccatalog-frontend/issues/1000)
+
+The decision was made to audit the tab indices along with internationalizing the page.
+The accessibility changes will be done after the completion of internationalization as the aria-labels will have to be internationalized as well.
+
+The first two weeks involved setting up vue-i18n, auditing the tab index for homepage and internationalizing it.
+The tab index audit for the homepage is shown below:
+
+![audit.png](audit.png)
+
+The internationalization part was pretty straightforward: we just had to export all the strings to the JSON files and load translations through the i18n module.
+For complex elements of the type ```string <tag>string</tag> string``` I went for the templating method.
+Here we use the v-slot attribute of the i18n functional component to convert the element into a template where the tag occupies a slot in the syntax.
+
+```
+<i18n path="footer.caption.label" tag="p" class="caption">
+    <template v-slot:noted>
+        <a href="https://creativecommons.org/policies#license" target="_blank" rel="noopener">{{$t('footer.caption.noted')}}</a>
+    </template>
+    <template v-slot:attribution>
+        <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank" rel="noopener">
+        {{$t('footer.caption.attribution')}}
+        </a>
+    </template>
+    <template v-slot:icons>
+        <a href="https://fontawesome.com/" target="_blank" rel="noopener" class="has-text-white">
+        {{$t('footer.caption.icons')}}
+        </a>
+    </template>
+</i18n>
+```
+
+The final outcome looks pretty good:
+
+![final.png](final.png)
+
+And voila we are done with the first two weeks. I also internationalized the header and the footer along with the homepage.
+You can track the work done for these weeks through these PRs:
+
+1. [setup internationalization plugin](https://github.com/creativecommons/cccatalog-frontend/pull/1007)
+2. [Internationalize homepage, header and footer](https://github.com/creativecommons/cccatalog-frontend/pull/1013)
+
+The progress of the project can be tracked on [cc-search](https://github.com/creativecommons/cccatalog-frontend)
+
+CC Search Accessibility is my GSoC 2020 project under the guidance of [Ari Madian](https://opensource.creativecommons.org/blog/authors/akmadian/), who is the primary mentor for this project. [Anna Tumadóttir](https://creativecommons.org/author/annacreativecommons-org/) and engineering director [Kriti
+Godey](https://creativecommons.org/author/kriticreativecommons-org/) have been very supportive all along.
@@ -85,7 +85,7 @@ brainchild comes of age.
 There are a number of components under construction right now such as cards and social media buttons. They will be
 published on the styleguide very soon. After these, the final month, phase III of the project's GSoC term will be
 spent continuously polishing the project to suit the needs of all CC apps as discovered during the integration with
-CC Search as mentioned by [Breno Ferreira](/blog/authors/brenoferreira) in the 'Next steps' in his post on
+CC Search as mentioned by [Breno Ferreira](/blog/authors/brenoferreira/) in the 'Next steps' in his post on
 [CC Search Redesign](/blog/entries/cc-search-redesign/).
 
 In keeping with the culinary theme of this post, think of it as sweet sweet dessert.
 
@@ -54,4 +54,4 @@ Before this internship, I had just switched careers from Network engineering to
 
 I will continue to contribute to CC open source projects especially to the Vocabulary project that I have become a part of. I would love to see the application of Vocabulary to the development of other CC platforms and applications.  I also want to apply the skills that I have acquired to get a full-time software developer position.
 
-My special appreciation to Outreachy for this opportunity, the entire CC team especially those I worked with, My mentors [Hugo Solar](/blog/authors/hugosolar) and [Dhruv Bhanushali](/blog/authors/dhruvkb) for their guidance, direction, and help whenever I got stuck, also to the Director of Engineering [Kriti Godey](/blog/authors/kgodey) for always checking up on me ensuring I had a wonderful internship experience.
+My special appreciation to Outreachy for this opportunity, the entire CC team especially those I worked with, My mentors [Hugo Solar](/blog/authors/hugosolar/) and [Dhruv Bhanushali](/blog/authors/dhruvkb/) for their guidance, direction, and help whenever I got stuck, also to the Director of Engineering [Kriti Godey](/blog/authors/kgodey/) for always checking up on me ensuring I had a wonderful internship experience.
@@ -0,0 +1,105 @@
+title: Data flow: from API to DB
+---
+categories: 
+
+cc-catalog
+airflow
+gsoc
+gsoc-2020
+---
+author: srinidhi
+---
+series: gsoc-2020-cccatalog
+---
+pub_date: 2020-07-22
+--- 
+body:
+
+## Introduction 
+The CC Catalog project  handles the flow of image metadata from the source or
+provider and loads it to the database, which is then surfaced to the [CC
+search][CC_search] tool. The workflows are set up for each provider to gather
+metadata about CC licensed images. These workflows are handled with the help of
+Apache Airflow. Airflow is an open source tool that helps us to schedule and
+monitor workflows. 
+[CC_search]: https://ccsearch.creativecommons.org/about
+
+## Airflow intro
+Apache Airflow is an open source tool that helps us to schedule tasks and
+monitor workflows. It provides an easy to use UI that makes managing tasks
+easy. In Airflow, the tasks we want to schedule are organised in DAGs
+(Directed Acyclic Graphs). DAGs consist of a collection of tasks, and a
+relationship defined among these tasks, so that they run in an organised
+manner. DAG files are standard python files that are loaded from the defined
+`DAG_FOLDER` on a host. Airflow selects all the python files in the
+`DAG_FOLDER` that have a DAG instance defined globally, and executes them to
+create the DAG objects.
+
+## CC Catalog Workflow
+In the CC catalog, Airflow is set up inside a docker container along with other
+services. The loader and provider workflows are inside the `dags` directory in
+the repo [dag folder][dags]. Provider workflows are set up to pull metadata
+about CC licensed images from the respective providers. The data pulled is
+structured into a standardised format and written into a TSV (Tab Separated
+Values) file locally. These TSV files are then loaded into S3 and then finally
+to PostgreSQL DB by the loader workflow.
+[dags]: https://github.com/creativecommons/cccatalog/tree/dacb48d24c6ae9b532ff108589b9326bde0d37a3/src/cc_catalog_airflow/dags
+
+## Provider API workflow
+The provider workflows are usually scheduled in one of two time frequencies,
+daily or monthly. 
+
+Providers such as Flickr or Wikimedia Commons that are filtered using the date
+parameter are usually scheduled for daily jobs. These providers have a large
+volume of continuously changing data, and so daily updates are required to keep
+the data in sync.
+
+Providers that are scheduled for monthly ingestion are ones with a relatively
+low volume of data, or for which filtering by date is not possible. This means
+we need to ingest the entire collection at once. Examples are museum providers
+like the [Science museum UK][science_museum] or [Statens Museum for
+Kunst][smk]. We don’t expect museum providers to change data on a daily basis. 
+
+[science_museum]: https://collection.sciencemuseumgroup.org.uk/
+[smk]: https://www.smk.dk/
+
+The scheduling of the DAGs by the scheduler daemons depends on a few
+parameters. 
+
+- ```start_date``` - it denotes the starting date from which the
+task should begin running. 
+- ```schedule_interval``` - it denotes the interval between subsequent runs. It
+can be specified with airflow keyword strings like “@daily”, “@weekly”,
+“@monthly”, “@yearly”. Other than these, we can also schedule the interval using
+a cron expression.
+
+
+Example: Cleveland museum is currently scheduled for a monthly crawl with a
+starting date as ```2020-01-15```. [cleveland_museum_workflow][clm_workflow]
+
+[clm_workflow]: https://github.com/creativecommons/cccatalog/blob/dacb48d24c6ae9b532ff108589b9326bde0d37a3/src/cc_catalog_airflow/dags/cleveland_museum_workflow.py
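
The two scheduling parameters above can be illustrated with a small sketch. It does not import Airflow; it only records the cron expressions that Airflow documents as equivalent to its preset strings. The names `PRESET_TO_CRON` and `describe_schedule`, and the provider label `cleveland_museum`, are hypothetical and are not taken from the cccatalog code.

```python
from datetime import datetime

# Cron expressions that Airflow documents as equivalent to its presets.
PRESET_TO_CRON = {
    "@daily": "0 0 * * *",    # every day at midnight
    "@weekly": "0 0 * * 0",   # every Sunday at midnight
    "@monthly": "0 0 1 * *",  # first day of every month at midnight
    "@yearly": "0 0 1 1 *",   # every 1 January at midnight
}


def describe_schedule(provider, start_date, schedule_interval):
    """Return the scheduling parameters of a provider in a normalised form.

    Hypothetical helper: a preset such as "@monthly" is translated to its
    cron expression, while any other value is assumed to already be a cron
    expression and is passed through unchanged.
    """
    return {
        "provider": provider,
        "start_date": datetime.strptime(start_date, "%Y-%m-%d"),
        "cron": PRESET_TO_CRON.get(schedule_interval, schedule_interval),
    }


# The monthly crawl from the example above, starting on 2020-01-15.
cleveland = describe_schedule("cleveland_museum", "2020-01-15", "@monthly")
print(cleveland["cron"])
```

A provider that is filtered by date would pass `"@daily"` instead, and a custom cron string such as `"30 2 * * *"` is returned as given.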
+
+## Loader workflow
+The data from the provider scripts are not directly loaded into S3. Instead,
+they are stored in a TSV file on the local disk, and the tsv_postgres workflow
+handles loading of data to S3, and eventually PostgreSQL. The DAG starts by
+calling the task to stage the oldest tsv file from the output directory of the
+provider scripts to the staging directory. Next, two tasks run in parallel, one
+loads the tsv file in the staging directory to S3, while the other creates the
+loading table in the PostgreSQL database. Once the data is loaded to S3 and the
+loading table has been created, the data from S3 is loaded to the intermediate
+loading table and then finally inserted into the image table. If loading from
+S3 fails the data is loaded to PostgreSQL from the locally stored tsv file.
+When the data has been successfully transferred to the image table, the
+intermediate loading table is dropped and the tsv files in the staging
+directory are deleted. If copying the tsv files to S3 fails, those
+files are moved to the failure directory for future inspection.
+
+<div style="text-align:center;">
+    <img src="loader_workflow.png" width="1000px"/>
+    <p> Loader workflow </p>
+</div>
+
+## Acknowledgement
+
+I would like to thank Brent Moran for helping me write this blog post.
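
The staging and fallback steps described in the Loader workflow section of this post can be sketched in plain Python. This is a minimal illustration and not the code of the tsv_postgres workflow: `stage_oldest_tsv` and `load_staged_file` are hypothetical names, the S3 and PostgreSQL loads are passed in as callables, and the sketch assumes a file goes to the failure directory only when both load paths fail.

```python
import os
import shutil


def stage_oldest_tsv(output_dir, staging_dir):
    """Move the oldest .tsv file from output_dir into staging_dir.

    Returns the staged path, or None when there is nothing to stage.
    """
    tsv_files = [
        os.path.join(output_dir, name)
        for name in os.listdir(output_dir)
        if name.endswith(".tsv")
    ]
    if not tsv_files:
        return None
    oldest = min(tsv_files, key=os.path.getmtime)
    os.makedirs(staging_dir, exist_ok=True)
    return shutil.move(oldest, os.path.join(staging_dir, os.path.basename(oldest)))


def load_staged_file(staged_path, load_from_s3, load_from_local, failure_dir):
    """Load one staged file, preferring S3 and falling back to the local copy.

    load_from_s3 and load_from_local are caller-supplied callables that raise
    on failure (assumed interfaces). Returns "s3", "local" or "failed".
    """
    try:
        load_from_s3(staged_path)
        source = "s3"
    except Exception:
        try:
            load_from_local(staged_path)
            source = "local"
        except Exception:
            # Keep the file for later inspection instead of deleting it.
            os.makedirs(failure_dir, exist_ok=True)
            shutil.move(
                staged_path,
                os.path.join(failure_dir, os.path.basename(staged_path)),
            )
            return "failed"
    # The load succeeded, so the staged file is no longer needed.
    os.remove(staged_path)
    return source
```

Passing the loaders as callables keeps the sketch free of S3 and database dependencies, so the control flow can be exercised with stub functions.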