cc-search/index.html
+30 -25
@@ -173,26 +173,23 @@ <h4>Q2 2020</h4>
173
173
</tr>
174
174
175
175
<tr>
176
-
<th scope="row">Move data cleaning pipeline from API to Catalog</th>
177
-
<td scope="row">Move our data cleaning code from the ingestion step of the API to the initial data processing step of the Catalog to eliminate unnecessary repetitive data cleaning.</td>
176
+
<th scope="row">Plan search algorithm changes for new metadata [AWS Grant]</th>
177
+
<td scope="row">Plan out search algorithm changes to incorporate image metadata generated via AWS Rekognition.</td>
<td scope="row">Improve data processing infrastructure in the Catalog by parallelizing loading and moving storage of data files from providers to S3.</td>
183
+
</tr>
192
184
193
185
<tr>
194
-
<th scope="row">Design Sprint: Audio UI for CC Search</th>
195
-
<td scope="row">Designing and prototyping an upcoming user interface for searching for audio on CC Search.</td>
186
+
<th scope="row">Implement architecture for schema for new metadata [AWS Grant]</th>
187
+
<td scope="row">Update Catalog schema to include new metadata generated through AWS Rekognition.</td>
188
+
</tr>
189
+
190
+
<tr>
191
+
<th scope="row">Image Selection for Rekognition [AWS Grant]</th>
192
+
<td scope="row">Develop metrics for and select a set of ~100 million high quality images for which we'll generate additional metadata through AWS Rekognition.</td>
196
193
</tr>
197
194
198
195
<tr>
@@ -206,15 +203,28 @@ <h4>Q3 2020</h4>
206
203
</tr>
207
204
208
205
<tr>
209
-
<th scope="row">Plan search algorithm changes for new metadata [AWS Grant]</th>
210
-
<td scope="row">Plan out search algorithm changes to incorporate image metadata generated via AWS Rekognition.</td>
206
+
<th scope="row">Move data cleaning pipeline from API to Catalog</th>
207
+
<td scope="row">Move our data cleaning code from the ingestion step of the API to the initial data processing step of the Catalog to eliminate unnecessary repetitive data cleaning.</td>
211
208
</tr>
212
209
213
210
<tr>
214
-
<th scope="row">Implement architecture for schema for new metadata [AWS Grant]</th>
215
-
<td scope="row">Update Catalog schema to include new metadata generated through AWS Rekognition.</td>
211
+
<th scope="row">Design Sprint: Audio UI for CC Search</th>
212
+
<td scope="row">Designing and prototyping an upcoming user interface for searching for audio on CC Search.</td>
<td scope="row">Improve how and where we explain licenses, and consider ways to make it easier for reusers to understand and comply with license requirements.</td>
@@ -250,11 +260,6 @@ <h4>Q3 2020</h4>
250
260
<td scope="row">Design and user test UIs for audio. Ingest a pilot collection of audio to the Catalog, build support in the API. Integrate design to frontend to allow users to search for CC licensed audio.</td>
251
261
</tr>
252
262
253
-
<tr>
254
-
<th scope="row">Image Selection for Rekognition [AWS Grant]</th>
255
-
<td scope="row">Develop metrics for and select a set of ~100 million high quality images for which we'll generate additional metadata through AWS Rekognition.</td>
256
-
</tr>
257
-
258
263
<tr>
259
264
<th scope="row">Improve Common Crawl Infrastructure</th>
260
265
<td scope="row">Update our Common Crawl provider infrastructure to: