Javadoc

garydgregory · garydgregory · commit a38cf236ca02 · 2023-06-04T11:31:27.000-04:00
Close HTML tags
diff --git a/src/main/java/org/apache/commons/codec/language/bm/BeiderMorseEncoder.java b/src/main/java/org/apache/commons/codec/language/bm/BeiderMorseEncoder.java
@@ -25,12 +25,13 @@
  * <p>
  * Beider-Morse phonetic encodings are optimised for family names. However, they may be useful for a wide range of
  * words.
+ * </p>
  * <p>
  * This encoder is intentionally mutable to allow dynamic configuration through bean properties. As such, it is mutable,
  * and may not be thread-safe. If you require a guaranteed thread-safe encoding then use {@link PhoneticEngine}
  * directly.
- * <p>
- * <b>Encoding overview</b>
+ * </p>
+ * <h2>Encoding overview</h2>
  * <p>
  * Beider-Morse phonetic encoding is a multi-step process. Firstly, a table of rules is consulted to guess what
  * language the word comes from. For example, if it ends in "{@code ault}" then it infers that the word is French.
@@ -42,28 +43,31 @@
  * representation. Again, sometimes there are multiple ways this could be done and sometimes things that can be
  * pronounced in several ways in the source language have only one way to represent them in this average phonetic
  * language, so the result is again a set of phonetic spellings.
+ * </p>
  * <p>
  * Some names are treated as having multiple parts. This can be due to two things. Firstly, they may be hyphenated. In
  * this case, each individual hyphenated word is encoded, and then these are combined end-to-end for the final encoding.
  * Secondly, some names have standard prefixes, for example, "{@code Mac/Mc}" in Scottish (English) names. As
  * sometimes it is ambiguous whether the prefix is intended or is an accident of the spelling, the word is encoded once
  * with the prefix and once without it. The resulting encoding contains one and then the other result.
- * <p>
- * <b>Encoding format</b>
+ * </p>
+ * <h2>Encoding format</h2>
  * <p>
  * Individual phonetic spellings of an input word are represented in upper- and lower-case roman characters. Where there
  * are multiple possible phonetic representations, these are joined with a pipe ({@code |}) character. If multiple
  * hyphenated words were found, or if the word may contain a name prefix, each encoded word is placed in parentheses and
  * these blocks are then joined with hyphens. For example, "{@code d'ortley}" has a possible prefix. The form
  * without prefix encodes to "{@code ortlaj|ortlej}", while the form with prefix encodes to "
  * {@code dortlaj|dortlej}". Thus, the full, combined encoding is "{@code (ortlaj|ortlej)-(dortlaj|dortlej)}".
+ * </p>
  * <p>
  * The encoded forms are often quite a bit longer than the input strings. This is because a single input may have many
  * potential phonetic interpretations. For example, "{@code Renault}" encodes to "
  * {@code rYnDlt|rYnalt|rYnult|rinDlt|rinalt|rinult}". The {@code APPROX} rules will tend to produce larger
  * encodings as they consider a wider range of possible, approximate phonetic interpretations of the original word.
  * Down-stream applications may wish to further process the encoding for indexing or lookup purposes, for example, by
  * splitting on pipe ({@code |}) and indexing under each of these alternatives.
+ * </p>
  * <p>
  * <b>Note</b>: this version of the Beider-Morse encoding is equivalent to v3.4 of the reference implementation.
  * </p>
diff --git a/src/main/java/org/apache/commons/codec/language/bm/Lang.java b/src/main/java/org/apache/commons/codec/language/bm/Lang.java
@@ -36,18 +36,23 @@
  * <p>
  * This class encapsulates rules used to guess the possible languages that a word originates from. This is
  * done by reference to a whole series of rules distributed in resource files.
+ * </p>
  * <p>
  * Instances of this class are typically managed through the static factory method instance().
  * Unless you are developing your own language guessing rules, you will not need to interact with this class directly.
+ * </p>
  * <p>
  * This class is intended to be immutable and thread-safe.
- * <p>
- * <b>Lang resources</b>
+ * </p>
+ * <h2>Lang resources</h2>
  * <p>
  * Language guessing rules are typically loaded from resource files. These are UTF-8 encoded text files.
  * They are systematically named following the pattern:
+ * </p>
  * <blockquote>org/apache/commons/codec/language/bm/lang.txt</blockquote>
+ * <p>
  * The format of these resources is the following:
+ * </p>
  * <ul>
  * <li><b>Rules:</b> whitespace separated strings.
  * There should be 3 columns to each row, and these will be interpreted as:
@@ -65,6 +70,7 @@
  * </ul>
  * <p>
  * Port of lang.php
+ * </p>
  *
  * @since 1.6
  */
@@ -119,6 +125,7 @@ public static Lang instance(final NameType nameType) {
      * <p>
      * In normal use, you will obtain instances of Lang through the {@link #instance(NameType)} method.
      * You will only need to call this yourself if you are developing custom language mapping rules.
+     * </p>
      *
      * @param languageRulesResourceName
      *            the fully-qualified resource name to load
diff --git a/src/main/java/org/apache/commons/codec/language/bm/Languages.java b/src/main/java/org/apache/commons/codec/language/bm/Languages.java
@@ -33,10 +33,12 @@
  * <p>
  * Language codes are typically loaded from resource files. These are UTF-8
  * encoded text files. They are systematically named following the pattern:
+ * </p>
  * <blockquote>org/apache/commons/codec/language/bm/${{@link NameType#getName()}
  * languages.txt</blockquote>
  * <p>
  * The format of these resources is the following:
+ * </p>
  * <ul>
  * <li><b>Language:</b> a single string containing no whitespace</li>
  * <li><b>End-of-line comments:</b> Any occurrence of '//' will cause all text
@@ -48,8 +50,10 @@
  * </ul>
  * <p>
  * Ported from language.php
+ * </p>
  * <p>
  * This class is immutable and thread-safe.
+ * </p>
  *
  * @since 1.6
  */
diff --git a/src/main/java/org/apache/commons/codec/language/bm/NameType.java b/src/main/java/org/apache/commons/codec/language/bm/NameType.java
@@ -26,13 +26,19 @@
  */
 public enum NameType {
 
-    /** Ashkenazi family names */
+    /**
+     * Ashkenazi family names.
+     */
     ASHKENAZI("ash"),
 
-    /** Generic names and words */
+    /**
+     * Generic names and words.
+     */
     GENERIC("gen"),
 
-    /** Sephardic family names */
+    /**
+     * Sephardic family names.
+     */
     SEPHARDIC("sep");
 
     private final String name;
diff --git a/src/main/java/org/apache/commons/codec/language/bm/PhoneticEngine.java b/src/main/java/org/apache/commons/codec/language/bm/PhoneticEngine.java
@@ -41,12 +41,15 @@
  * into account the likely source language. Next, this phonetic representation is converted into a
  * pan-European 'average' representation, allowing comparison between different versions of essentially
  * the same word from different languages.
+ * </p>
  * <p>
  * This class is intentionally immutable and thread-safe.
  * If you wish to alter the settings for a PhoneticEngine, you
  * must make a new one with the updated settings.
+ * </p>
  * <p>
  * Ported from phoneticengine.php
+ * </p>
  *
  * @since 1.6
  */
@@ -97,6 +100,7 @@ public void append(final CharSequence str) {
          * <p>
          * This will lengthen phonemes that have compatible language sets to the expression, and drop those that are
          * incompatible.
+         * </p>
          *
          * @param phonemeExpr   the expression to apply
          * @param maxPhonemes   the maximum number of phonemes to build up
@@ -237,6 +241,7 @@ public boolean isFound() {
 
     /**
      * Joins some strings with an internal separator.
+     *
      * @param strings   Strings to join
      * @param sep       String to separate them with
      * @return a single String consisting of each element of {@code strings} interleaved by {@code sep}
diff --git a/src/main/java/org/apache/commons/codec/language/bm/Rule.java b/src/main/java/org/apache/commons/codec/language/bm/Rule.java
@@ -39,6 +39,7 @@
  * <p>
  * Rules have a pattern, left context, right context, output phoneme, set of languages for which they apply
  * and a logical flag indicating if all languages must be in play. A rule matches if:
+ * </p>
  * <ul>
  * <li>the pattern matches at the current position</li>
  * <li>the string up until the beginning of the pattern matches the left context</li>
@@ -49,16 +50,19 @@
  * <p>
  * Rules are typically generated by parsing rules resources. In normal use, there will be no need for the user
  * to explicitly construct their own.
+ * </p>
  * <p>
  * Rules are immutable and thread-safe.
- * <p>
- * <b>Rules resources</b>
+ * </p>
+ * <h2>Rules resources</h2>
  * <p>
  * Rules are typically loaded from resource files. These are UTF-8 encoded text files. They are systematically
  * named following the pattern:
+ * </p>
  * <blockquote>org/apache/commons/codec/language/bm/${NameType#getName}_${RuleType#getName}_${language}.txt</blockquote>
  * <p>
  * The format of these resources is the following:
+ * </p>
  * <ul>
  * <li><b>Rules:</b> whitespace separated, double-quoted strings. There should be 4 columns to each row, and these
  * will be interpreted as:
diff --git a/src/main/java/org/apache/commons/codec/language/bm/RuleType.java b/src/main/java/org/apache/commons/codec/language/bm/RuleType.java
@@ -24,11 +24,19 @@
  */
 public enum RuleType {
 
-    /** Approximate rules, which will lead to the largest number of phonetic interpretations. */
+    /**
+     * Approximate rules, which will lead to the largest number of phonetic interpretations.
+     */
     APPROX("approx"),
-    /** Exact rules, which will lead to a minimum number of phonetic interpretations. */
+
+    /**
+     * Exact rules, which will lead to a minimum number of phonetic interpretations.
+     */
     EXACT("exact"),
-    /** For internal use only. Please use {@link #APPROX} or {@link #EXACT}. */
+
+    /**
+     * For internal use only. Please use {@link #APPROX} or {@link #EXACT}.
+     */
     RULES("rules");
 
     private final String name;
diff --git a/src/main/java/org/apache/commons/codec/net/QuotedPrintableCodec.java b/src/main/java/org/apache/commons/codec/net/QuotedPrintableCodec.java
@@ -247,6 +247,7 @@ private static boolean isWhitespace(final int b) {
      * <p>
      * This function implements a subset of quoted-printable encoding specification (rule #1 and rule #2) as defined in
      * RFC 1521 and is suitable for encoding binary data and unformatted text.
+     * </p>
      *
      * @param printable
      *            bitset of characters deemed quoted-printable
@@ -264,6 +265,7 @@ public static final byte[] encodeQuotedPrintable(final BitSet printable, final b
      * Depending on the selection of the {@code strict} parameter, this function either implements the full ruleset
      * or only a subset of quoted-printable encoding specification (rule #1 and rule #2) as defined in
      * RFC 1521 and is suitable for encoding binary data and unformatted text.
+     * </p>
      *
      * @param printable
      *            bitset of characters deemed quoted-printable
@@ -347,6 +349,7 @@ public static final byte[] encodeQuotedPrintable(BitSet printable, final byte[]
      * <p>
      * This function fully implements the quoted-printable encoding specification (rule #1 through rule #5) as
      * defined in RFC 1521.
+     * </p>
      *
      * @param bytes
      *            array of quoted-printable characters
@@ -387,6 +390,7 @@ public static final byte[] decodeQuotedPrintable(final byte[] bytes) throws Deco
      * Depending on the selection of the {@code strict} parameter, this function either implements the full ruleset
      * or only a subset of quoted-printable encoding specification (rule #1 and rule #2) as defined in
      * RFC 1521 and is suitable for encoding binary data and unformatted text.
+     * </p>
      *
      * @param bytes
      *            array of bytes to be encoded
@@ -403,6 +407,7 @@ public byte[] encode(final byte[] bytes) {
      * <p>
      * This function fully implements the quoted-printable encoding specification (rule #1 through rule #5) as
      * defined in RFC 1521.
+     * </p>
      *
      * @param bytes
      *            array of quoted-printable characters
@@ -421,6 +426,7 @@ public byte[] decode(final byte[] bytes) throws DecoderException {
      * Depending on the selection of the {@code strict} parameter, this function either implements the full ruleset
      * or only a subset of quoted-printable encoding specification (rule #1 and rule #2) as defined in
      * RFC 1521 and is suitable for encoding binary data and unformatted text.
+     * </p>
      *
      * @param sourceStr
      *            string to convert to quoted-printable form
@@ -571,6 +577,7 @@ public String getDefaultCharset() {
      * Depending on the selection of the {@code strict} parameter, this function either implements the full ruleset
      * or only a subset of quoted-printable encoding specification (rule #1 and rule #2) as defined in
      * RFC 1521 and is suitable for encoding binary data and unformatted text.
+     * </p>
      *
      * @param sourceStr
      *            string to convert to quoted-printable form
@@ -592,6 +599,7 @@ public String encode(final String sourceStr, final Charset sourceCharset) {
      * Depending on the selection of the {@code strict} parameter, this function either implements the full ruleset
      * or only a subset of quoted-printable encoding specification (rule #1 and rule #2) as defined in
      * RFC 1521 and is suitable for encoding binary data and unformatted text.
+     * </p>
      *
      * @param sourceStr
      *            string to convert to quoted-printable form
diff --git a/src/main/java/org/apache/commons/codec/net/RFC1522Codec.java b/src/main/java/org/apache/commons/codec/net/RFC1522Codec.java
@@ -56,6 +56,7 @@ abstract class RFC1522Codec {
      * <p>
      * This method constructs the "encoded-word" header common to all the RFC 1522 codecs and then invokes
      * {@link #doEncoding(byte[])}  method of a concrete class to perform the specific encoding.
+     * </p>
      *
      * @param text
      *            a string to encode
@@ -86,6 +87,7 @@ protected String encodeText(final String text, final Charset charset) throws Enc
      * <p>
      * This method constructs the "encoded-word" header common to all the RFC 1522 codecs and then invokes
      * {@link #doEncoding(byte[])}  method of a concrete class to perform the specific encoding.
+     * </p>
      *
      * @param text
      *            a string to encode
@@ -112,6 +114,7 @@ protected String encodeText(final String text, final String charsetName)
      * <p>
      * This method processes the "encoded-word" header common to all the RFC 1522 codecs and then invokes
      * {@link #doDecoding(byte[])}  method of a concrete class to perform the specific decoding.
+     * </p>
      *
      * @param text
      *            a string to decode