Commit 48c15b1

Synchronized build
1 parent b3108a5 commit 48c15b1

1 file changed, +42 -62 lines changed

cc-search/index.html

@@ -180,43 +180,33 @@ <h4>Q3 2020</h4>
 </tr>
 
 <tr>
-<th scope="row">Improve Catalog Deployment and Provisioning</th>
-<td scope="row">Manage Catalog deployment and provisioning entirely through infrastructure as code.</td>
-</tr>
-
-<tr>
-<th scope="row">Improve Documentation for Community Contributors</th>
-<td scope="row">Create better documentation for community contributors by consolidating internal and public documentation and making it available for everyone.</td>
+<th scope="row">Implement architecture for schema for new metadata [AWS Grant]</th>
+<td scope="row">Update Catalog schema to include new metadata generated through AWS Rekognition.</td>
 </tr>
 
 <tr>
 <th scope="row">Plan search algorithm changes for new metadata [AWS Grant]</th>
 <td scope="row">Plan out search algorithm changes to incorporate image metadata generated via AWS Rekognition.</td>
 </tr>
 
-<tr>
-<th scope="row">Implement architecture for schema for new metadata [AWS Grant]</th>
-<td scope="row">Update Catalog schema to include new metadata generated through AWS Rekognition.</td>
-</tr>
-
 <tr>
 <th scope="row">License Explanation/Compliance Improvements</th>
 <td scope="row">Improve how and where we explain licenses, and consider ways to make it easier for reusers to understand and comply with license requirements.</td>
 </tr>
 
 <tr>
-<th scope="row">Improved Support Pages</th>
-<td scope="row">Improve the support pages on CC Search, which includes the Collections page, for a better experience. Add explanation text for collections, improve flow.</td>
+<th scope="row">Offline old CC Search</th>
+<td scope="row">Offline Old Search (oldsearch.creativecommons.org) and redirect traffic to CC Search. Prior to this, build in messaging on Old Search, and support similar functionality on CC Search. See &#34;Meta Search Integration&#34; for related work.</td>
 </tr>
 
 <tr>
-<th scope="row">Design Sprint: Meta Search Integration</th>
-<td scope="row">Integrating meta search functionality into CC Search for sources that are not currently indexed, and content types we do not currently support.</td>
+<th scope="row">Web Monetization: Phase 1</th>
+<td scope="row">Research and test potential integrations for Web Monetization into CC Search and other CC web properties.</td>
 </tr>
 
 <tr>
-<th scope="row">Offline old CC Search</th>
-<td scope="row">Offline Old Search (oldsearch.creativecommons.org) and redirect traffic to CC Search. Prior to this, build in messaging on Old Search, and support similar functionality on CC Search. See &#34;Meta Search Integration&#34; for related work.</td>
+<th scope="row">Improved Support Pages</th>
+<td scope="row">Improve the support pages on CC Search, which includes the Collections page, for a better experience. Add explanation text for collections, improve flow.</td>
 </tr>
 
 <tr>
@@ -229,16 +219,6 @@ <h4>Q3 2020</h4>
 <td scope="row">Build infrastructure necessary for internationalization, to allow CC Search to be accessible in other languages.</td>
 </tr>
 
-<tr>
-<th scope="row">Design Sprint: Audio UI for CC Search</th>
-<td scope="row">Designing and prototyping an upcoming user interface for searching for audio on CC Search.</td>
-</tr>
-
-<tr>
-<th scope="row">Audio Support and Integration</th>
-<td scope="row">Design and user test UIs for audio. Ingest a pilot collection of audio to the Catalog, build support in the API. Integrate design to frontend to allow users to search for CC licensed audio.</td>
-</tr>
-
 <tr>
 <th scope="row">Improve Common Crawl Infrastructure</th>
 <td scope="row">Update our Common Crawl provider infrastructure to:
@@ -247,43 +227,51 @@ <h4>Q3 2020</h4>
 </tr>
 
 <tr>
-<th scope="row">Use Data Dumps for Wikimedia Ingestion</th>
-<td scope="row">Switch our Catalog data ingestion for Wikimedia Commons to use the data dumps provided by Wikimedia instead of the MediaWiki API.</td>
+<th scope="row">Design Sprint: Audio UI for CC Search</th>
+<td scope="row">Designing and prototyping an upcoming user interface for searching for audio on CC Search.</td>
 </tr>
 
 <tr>
-<th scope="row">Web Monetization: Phase 1</th>
-<td scope="row">Research and test potential integrations for Web Monetization into CC Search and other CC web properties.</td>
+<th scope="row">Audio Support and Integration</th>
+<td scope="row">Design and user test UIs for audio. Ingest a pilot collection of audio to the Catalog, build support in the API. Integrate design to frontend to allow users to search for CC licensed audio.</td>
 </tr>
 
 <tr>
-<th scope="row">Scraping &amp; Resizing Work [AWS Grant]</th>
+<th scope="row">Scraping &amp; Resizing Work for Rekognition [AWS Grant]</th>
 <td scope="row">Store a private copy of all the images in the CC Catalog to analyze via machine learning.</td>
 </tr>
 
 <tr>
-<th scope="row">Wikidata integration with Catalog &amp; Search Algorithm</th>
-<td scope="row">Collect and use structured data from Wikidata to enhance our search algorithm with semantic search.</td>
-</tr>
-
-<tr>
-<th scope="row">Usage/Reuse Metrics Dashboard</th>
-<td scope="row">Build an analytics UI that is fed by Google Analytics and our internal analytics database.</td>
+<th scope="row">Run Rekognition on 100m images [AWS Grant]</th>
+<td scope="row">Generate metadata via machine learning (using AWS Rekognition) on a set of ~100 million high quality images from the CC Catalog.</td>
 </tr>
 
 <tr>
 <th scope="row">Switch from Common Crawl to API</th>
 <td scope="row">For all possible providers, use their APIs to ingest data into the CC Catalog instead of scraping websites via Common Crawl data.</td>
 </tr>
 
+</tbody>
+</table>
+
+<h4>Q4 2020</h4>
+<table class="table table-striped">
+<thead class="thead-dark">
+<tr>
+<th scope="col">Task Name</th>
+<th scope="col">Task Description</th>
+</tr>
+</thead>
+<tbody>
+
 <tr>
-<th scope="row">Run Rekognition on 100m images [AWS Grant]</th>
-<td scope="row">Generate metadata via machine learning (using AWS Rekognition) on a set of ~100 million high quality images from the CC Catalog.</td>
+<th scope="row">Search Relevance Improvements: Language Analysis, Quality Metrics, Minimums</th>
+<td scope="row">None</td>
 </tr>
 
 <tr>
-<th scope="row">Upgrade Catalog: Data Lake</th>
-<td scope="row">Upgrade the CC Catalog database to use a schema-less database instead of the relational database (Postgres) that we currently use.</td>
+<th scope="row">Plan UI Updates in Response to Metadata [AWS Grant]</th>
+<td scope="row">Design updates to the CC Search UI in response to new metadata available as a result of applying machine learning to selected images in the Catalog. At a minimum, we expect new filters will be an option. Integration of design will take place subsequently.</td>
 </tr>
 
 <tr>
@@ -292,32 +280,24 @@ <h4>Q3 2020</h4>
 </tr>
 
 <tr>
-<th scope="row">Implement Use of Thumbnails in Search &amp; Catalog [AWS Grant]</th>
-<td scope="row">Implement changes to CC Search (frontend) and Catalog to make use of thumbnails, as they become available.</td>
+<th scope="row">Usage/Reuse Metrics Dashboard</th>
+<td scope="row">Build an analytics UI that is fed by Google Analytics and our internal analytics database.</td>
 </tr>
 
 <tr>
-<th scope="row">Partnership guidelines for all integration types</th>
-<td scope="row">Prepare partnership guidelines for CC Search. Create a page on CC Search publishing these guidelines.</td>
+<th scope="row">Scrape all images and set up feed for new ones</th>
+<td scope="row">Once the Rekognition crawl finishes, we want to crawl the rest of the catalog (but not feed them to rekognition). This will give us useful metadata like dimensions and quality.</td>
 </tr>
 
 <tr>
-<th scope="row">Plan UI Updates in Response to Metadata [AWS Grant]</th>
-<td scope="row">Design updates to the CC Search UI in response to new metadata available as a result of applying machine learning to selected images in the Catalog. At a minimum, we expect new filters will be an option. Integration of design will take place subsequently.</td>
+<th scope="row">Improve Documentation for Community Contributors</th>
+<td scope="row">Create better documentation for community contributors by consolidating internal and public documentation and making it available for everyone.</td>
 </tr>
 
-</tbody>
-</table>
-
-<h4>Q4 2020</h4>
-<table class="table table-striped">
-<thead class="thead-dark">
-<tr>
-<th scope="col">Task Name</th>
-<th scope="col">Task Description</th>
-</tr>
-</thead>
-<tbody>
+<tr>
+<th scope="row">Improve Catalog Deployment and Provisioning</th>
+<td scope="row">Manage Catalog deployment and provisioning entirely through infrastructure as code.</td>
+</tr>
 
 <tr>
 <th scope="row">API documentation improvements</th>
