Removing the English sentences from a subtitle file and keeping the Chinese ones

Suppose a subtitle file contains entries like this:

00:00:12:05 --> 00:00:12:20
Welcome.
歡迎大家。

To remove the English sentences from a file like this, the following Python code can be used.

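def is_chinese(string):
    """Return True if the string contains at least one Chinese character."""
    for ch in string:
        # U+4E00..U+9FFF is the CJK Unified Ideographs block
        if '\u4e00' <= ch <= '\u9fff':
            return True
    return False

# encoding='utf-8' assumes the subtitle file is UTF-8; adjust if it is not
with open('MindfulnessofBreathing3.txt', mode='r', encoding='utf-8') as in_file, \
        open('MindfulnessofBreathing3zh.txt', mode='w', encoding='utf-8') as out_file:
    for line in in_file:
        # isdigit() rather than isnumeric(), which is also True for
        # Chinese numerals such as 一
        if line[0].isdigit():
            # Timestamp line: keep it as is
            out_file.write(line)
        elif is_chinese(line):
            # Chinese subtitle line: keep it, then add a blank separator line
            out_file.write(line)
            out_file.write("\n")
        # Anything else (English lines, blank lines) is dropped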
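The same filtering logic can be tried without touching any files. The sketch below is only an illustration under a few assumptions: the names `has_cjk` and `keep_chinese` are hypothetical, the input is the sample entry shown earlier held in an in-memory `io.StringIO`, and a line counts as Chinese when it contains any character in the range U+4E00 to U+9FFF.

```python
import io


def has_cjk(text):
    """Return True if text contains a character in U+4E00..U+9FFF."""
    return any('\u4e00' <= ch <= '\u9fff' for ch in text)


def keep_chinese(lines):
    """Yield timestamp lines and Chinese lines; drop everything else.

    Hypothetical helper: same rules as the script above, but it works on
    any iterable of lines instead of fixed file names.
    """
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines in the input
        if stripped[0].isdigit():
            yield stripped + "\n"    # timestamp line
        elif has_cjk(stripped):
            yield stripped + "\n\n"  # Chinese line plus a blank separator


# The sample entry from the text, held in memory instead of a file
sample = io.StringIO(
    "00:00:12:05 --> 00:00:12:20\n"
    "Welcome.\n"
    "歡迎大家。\n"
)
result = "".join(keep_chinese(sample))
print(result)
```

Because `keep_chinese` accepts any iterable of lines, an open file object can be passed to it directly, and the output can be written with `out_file.writelines(...)`. Note that an English line that happens to start with a digit would still be kept, since it is treated as a timestamp line.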