Commit cfb9d1a

correct missing periods and transcribing errors
1 parent 285e0fc commit cfb9d1a

1 file changed, +8 -7 lines changed
  • content/blog/entries/2022-12-07-berkeley-quantifying

content/blog/entries/2022-12-07-berkeley-quantifying/contents.lr

@@ -56,10 +56,10 @@ creativecommons.org webpage that explains the license's rules (the deed).
 
 Therefore, we may use the following approach to identify and count CC-licensed
 documents:
-1. Select a list of CC tools to inspect (provided by CC)
+1. Select a list of CC tools to inspect (provided by CC).
 2. Use APIs of different online platforms to detect and count documents that
 are labeled as license by platform and/or contains a hyperlink towards CC
-license webpages
+license webpages.
 3. Store these data in tabular form to contain the count of documents protected
 under each type of CC tools.
 
@@ -179,8 +179,9 @@ Africa should be encouraged.
 Barplot for number of webpages protected by six primary CC licenses
 ![Barplot for number of webpages protected by six primary CC licenses](diagram_3c.png)
 
-We that **Attribution** (BY) and **Attribution-Nonderivative (BY-ND) are
-popular licenses** among the 3 billion documents sampled across the dataset
+We can see that **Attribution** (BY) and **Attribution-Nonderivative (BY-ND)
+are popular licenses** among the 3 billion documents sampled across the
+dataset.
 
 
 #### Diagram 6
@@ -277,10 +278,10 @@ resources, metrics, under different modeling contexts:
 
 #### Model of Google Webpages (Dun-Ming Huang)
 
-- Modeling Context: Multiclass Classifier (7 classes)
+- Modeling Context: Multiclass Classifier (7 classes).
 - Modeling Training set: Text webpage contents acquired from Google API
-collected webpages. (Common Crawl, the original choice, was marked
-unavailable due to source code corruption)
+collected webpages (Common Crawl, the original choice, was marked
+unavailable due to source code corruption).
 - Main Model Metric: Top-k accuracy, as this model is considered as the backend
 of a license recommendation system that receives webpage content and
 recommend 2 to 3 licenses to the user.
