Brian Foley
8bb56eeeea
[utils] Add extract_attributes for extracting html tag attributes
This is much more robust than just using regexps, and handles all
the common scenarios, such as empty/no values, repeated attributes,
entity decoding, mixed case names, and the different possible value
quoting schemes.
9 years ago
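The behaviours listed in this commit message (empty or missing values, repeated attributes, entity decoding, mixed-case names, different quoting schemes) can be illustrated with a minimal sketch built on the standard library's `html.parser`. This is an assumption-laden illustration, not necessarily the code that was committed; the helper class name `_AttrCollector` is hypothetical.

```python
from html.parser import HTMLParser


class _AttrCollector(HTMLParser):
    """Hypothetical helper: records the attributes of the start tag it sees."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases names, strips quotes and decodes entities.
        # dict() keeps the last value when an attribute is repeated.
        self.attrs = dict(attrs)


def extract_attributes(html_element):
    """Return the attributes of a single HTML start tag as a dict (sketch)."""
    parser = _AttrCollector()
    parser.feed(html_element)
    parser.close()
    return parser.attrs


print(extract_attributes('<e X="a" y=\'b\' z=c empty="" flag>'))
```

Attributes without a value come back as `None`, while explicitly empty ones come back as an empty string.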
Yen Chi Hsuan
5eb6bdced4
[utils] Multiple changes to base_n()
1. Renamed to encode_base_n()
2. Allow tables longer than 62 characters
3. Raise ValueError instead of AssertionError for invalid input data
4. Return the first character in the table instead of '0' for number 0
5. Add tests
9 years ago
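The five points above can be sketched roughly as follows. This is a minimal illustration written from the commit message alone, not a copy of the committed function; the default table and the rejection of negative numbers are assumptions.

```python
# Assumed default table: digits, lowercase, then uppercase (62 characters).
DEFAULT_TABLE = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'


def encode_base_n(num, n, table=None):
    """Encode a non-negative integer in base n using the given character table."""
    if not table:
        table = DEFAULT_TABLE[:n]
    # Point 3: invalid input raises ValueError rather than AssertionError.
    if n > len(table):
        raise ValueError('base %d exceeds table length %d' % (n, len(table)))
    if num < 0:
        # Assumption: negative numbers are rejected in this sketch.
        raise ValueError('num must be non-negative')
    # Point 4: zero maps to the first table character, not a literal '0'.
    if num == 0:
        return table[0]
    ret = ''
    while num:
        ret = table[num % n] + ret
        num //= n
    return ret


print(encode_base_n(80, 30))
print(encode_base_n(0, 3, table='xyz'))
```

Passing a custom `table` longer than 62 characters allows bases above 62, which is point 2 of the list.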
Sergey M․
f160785c5c
[utils] Remove AM/PM from unified_strdate patterns
9 years ago
Yen Chi Hsuan
5bc880b988
[utils] Add OHDave's RSA encryption function
9 years ago
Sergey M․
8411229bd5
[utils] Allow dot in strip_jsonp
9 years ago
Sergey M․
86296ad2cd
[utils] Add ability to control skipping false values in dict_get
9 years ago
Sergey M․
cbecc9b903
[utils] Add dict_get convenience method
9 years ago
Sergey M․
6b77d52b1f
[test_utils] Add tests for encode_compat_str
9 years ago
Yen Chi Hsuan
db2fe38b55
[utils] Support alternative timestamp format in TTML
Fixes #7608
9 years ago
Yen Chi Hsuan
d631d5f9f2
[utils] Fix TTML conversion
Tolerate invalid timestamps (closes #7909)
9 years ago
Sergey M․
31b2051e21
[utils] Add remove_quotes
9 years ago
Sergey M․
9cb9a5df77
[utils] Check ext with trailing slash against the list of known extensions
9 years ago
Sergey M․
5035536e3f
[test_utils] Add tests for determine_ext
9 years ago
Sergey M․
7aefc49c40
[utils] Skip invalid/non HTML entities (Closes #7518)
9 years ago
Jaime Marquínez Ferrándiz
6a75040278
[utils] unified_strdate: Return None if the date format can't be recognized (fixes #7340)
This issue was introduced with ae12bc3ebb; it returned the string 'None'.
9 years ago
Sergey M․
578c074575
[utils] Support list of xpath in xpath_element
9 years ago
Sergey M․
52c3a6e49d
[utils] Improve parse_iso8601
9 years ago
Jaime Marquínez Ferrándiz
36e6f62cd0
Use a wrapper around xml.etree.ElementTree.fromstring in Python 2.x (#7178)
Attributes aren't unicode objects, so they couldn't be directly used in info_dict fields (for example '--write-description' doesn't work with bytes).
9 years ago
Sergey M․
d01949dc89
[utils:js_to_json] Fix bad escape in double quoted strings
9 years ago
Sergey M․
f71264490c
[test_utils] Add tests for cli option converters
9 years ago
Sergey M․
87f70ab39d
[test_utils] Add more tests for xpath
9 years ago
Sergey M․
ee114368ad
[utils] Make value optional for find_xpath_attr
This allows selecting by attribute name without specifying the value, similar to the XPath syntax `[@attrib]`.
10 years ago
Yen Chi Hsuan
9c29bc69f7
[utils] Improve parse_duration
Now dots are parsed. For example '87 Min.'
10 years ago
Yen Chi Hsuan
1b0427e6c4
[utils] Support TTML without default namespace
In a strict sense such TTML is invalid, but Yahoo uses it.
10 years ago
Yen Chi Hsuan
7dff03636a
[utils] Support 'dur' field in TTML
10 years ago
Yen Chi Hsuan
d39e0f05db
[utils] Remove sanitize_url_path_consecutive_slashes()
This function is used only in SohuIE, which is updated to use a new
extraction logic.
10 years ago
Yen Chi Hsuan
0fe2ff78e6
[NBC] Enhance embedURL extraction (closes #2549)
10 years ago
Sergey M․
b3ed15b760
[utils] Add replace_extension
10 years ago
Sergey M․
a4bcaad773
[test_utils] Add tests for prepend_extension
10 years ago
Yen Chi Hsuan
bf6427d2fb
[ffmpeg] Add dfxp (TTML) subtitles support (#3432, #5146)
10 years ago
Yen Chi Hsuan
0a1603634b
[utils] Remove url_infer_protocol
10 years ago
Yen Chi Hsuan
418c5cc3fc
[udn] Add new extractor
10 years ago
Sergey M․
8cf70de428
[test_utils] Add test for unified_strdate
10 years ago
Sergey M․
ba9e68f402
[utils] Drop trailing comma before closing brace
10 years ago
Naglis Jonaitis
91757b0f37
[utils] Escape all HTML entities written in hexadecimal form
10 years ago
Jaime Marquínez Ferrándiz
5379a2d40d
[test/utils] Test xpath_text
10 years ago
Sergey M․
92a4793b3c
[utils] Place sanitize url function near other sanitizing functions
10 years ago
Sergey M․
2ebfeacabc
[utils] Keep dot and dotdot unmodified (Closes #5171)
10 years ago
Sergey M․
f18ef2d144
[utils] Disallow trailing dot in sanitize_path for a path part
10 years ago
Sergey M․
a2aaf4dbc6
[utils] Add sanitize_path
10 years ago
Yen Chi Hsuan
55969016e9
[utils] Add a function to sanitize consecutive slashes in URLs
10 years ago
Philipp Hagemeister
a7440261c5
[utils] Strip leading dots
Fixes #2865, closes #5087
10 years ago
Philipp Hagemeister
3e675fabe0
[airmozilla] Be more tolerant when nonessential items are missing (#5030)
10 years ago
Philipp Hagemeister
5a42414b9c
[utils] Prevent hyphen at beginning of filename (Fixes #5035)
10 years ago
Philipp Hagemeister
d305dd73a3
[utils] Fix js_to_json
Previously, the runtime could be atrocious for longer inputs.
10 years ago
Philipp Hagemeister
347de4931c
[YoutubeDL] Add generic video filtering (Fixes #4916)
This functionality is intended to eventually encompass the current format filtering.
10 years ago
Philipp Hagemeister
9bb8e0a3f9
[wsj] Add new extractor (Fixes #4854)
10 years ago
Philipp Hagemeister
8f4b58d70e
[ntvde] Add new extractor (Fixes #4850)
10 years ago
Philipp Hagemeister
cfb56d1af3
Add --list-thumbnails
10 years ago
Philipp Hagemeister
61ca9a80b3
[generic] Add support for BOMs (Fixes #4753)
10 years ago