Python File Handling Basics, Part 1

2022-09-07

Basic file operations

Opening and closing files

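The built-in function open opens the file at a given path and returns a file object.

open has two commonly used parameters: the first is the file name (an absolute or relative path), and the second is the mode.

'r'/'w'/'a'/'b' stand for read (the default), write, append, and binary.
Note: always close a file once you are done with it.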

a = open('Z:/test.txt', 'r')  # forward slashes avoid backslash escapes in Windows paths; the drive letter is not case-sensitive
a.close()  # close the file
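Forgetting close is easy, so the usual idiom is a with statement, which closes the file when the block exits. The following is a minimal sketch; the temporary path and the variable names are assumptions made so the example runs anywhere, standing in for Z:/test.txt.

```python
import os
import tempfile

# Hypothetical scratch file; the article itself uses Z:/test.txt.
path = os.path.join(tempfile.mkdtemp(), "test.txt")

with open(path, "w") as f:       # the file is closed when the block exits
    f.write("hello world\n")

with open(path, "r") as f:
    first_line = f.readline()

print(first_line.strip())
print(f.closed)                  # the with block has already closed the file
```

The file is closed even if the code inside the block raises an exception, which is why this form is generally preferred over a manual close.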

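About built-in functions:

The term "built-in function" keeps coming up. Built-in functions are simply functions that live in the builtins module, which is reachable through the name __builtins__.

The Python interpreter loads this module automatically, so its names need no import.

dir(__builtins__) lists every built-in name, which includes exceptions and constants as well as functions.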

print(dir(__builtins__))

# Output (Python 3 on Windows; the exact list varies by version and platform):

['ArithmeticError', 'AssertionError', 'AttributeError', 'BaseException', 'BlockingIOError', 'BrokenPipeError', 'BufferError', 'BytesWarning', 'ChildProcessError', 'ConnectionAbortedError', 'ConnectionError', 'ConnectionRefusedError', 'ConnectionResetError', 'DeprecationWarning', 'EOFError', 'Ellipsis', 'EnvironmentError', 'Exception', 'False', 'FileExistsError', 'FileNotFoundError', 'FloatingPointError', 'FutureWarning', 'GeneratorExit', 'IOError', 'ImportError', 'ImportWarning', 'IndentationError', 'IndexError', 'InterruptedError', 'IsADirectoryError', 'KeyError', 'KeyboardInterrupt', 'LookupError', 'MemoryError', 'ModuleNotFoundError', 'NameError', 'None', 'NotADirectoryError', 'NotImplemented', 'NotImplementedError', 'OSError', 'OverflowError', 'PendingDeprecationWarning', 'PermissionError', 'ProcessLookupError', 'RecursionError', 'ReferenceError', 'ResourceWarning', 'RuntimeError', 'RuntimeWarning', 'StopAsyncIteration', 'StopIteration', 'SyntaxError', 'SyntaxWarning', 'SystemError', 'SystemExit', 'TabError', 'TimeoutError', 'True', 'TypeError', 'UnboundLocalError', 'UnicodeDecodeError', 'UnicodeEncodeError', 'UnicodeError', 'UnicodeTranslateError', 'UnicodeWarning', 'UserWarning', 'ValueError', 'Warning', 'WindowsError', 'ZeroDivisionError', '__build_class__', '__debug__', '__doc__', '__import__', '__loader__', '__name__', '__package__', '__spec__', 'abs', 'all', 'any', 'ascii', 'bin', 'bool', 'breakpoint', 'bytearray', 'bytes', 'callable', 'chr', 'classmethod', 'compile', 'complex', 'copyright', 'credits', 'delattr', 'dict', 'dir', 'divmod', 'enumerate', 'eval', 'exec', 'exit', 'filter', 'float', 'format', 'frozenset', 'getattr', 'globals', 'hasattr', 'hash', 'help', 'hex', 'id', 'input', 'int', 'isinstance', 'issubclass', 'iter', 'len', 'license', 'list', 'locals', 'map', 'max', 'memoryview', 'min', 'next', 'object', 'oct', 'open', 'ord', 'pow', 'print', 'property', 'quit', 'range', 'repr', 'reversed', 'round', 'set', 'setattr', 'slice', 'sorted', 
'staticmethod', 'str', 'sum', 'super', 'tuple', 'type', 'vars', 'zip']
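Since the list above mixes exceptions, constants, and functions, it can help to filter it. This is a small sketch that imports the builtins module explicitly and keeps only callables that are not classes; the name function_names is an assumption made for illustration.

```python
import builtins

# Names usable without an import are attributes of the builtins module.
print(builtins.open is open)

# Keep callables that are not classes, which drops exceptions and types.
function_names = [
    name for name in dir(builtins)
    if callable(getattr(builtins, name))
    and not isinstance(getattr(builtins, name), type)
]
print("open" in function_names, "ValueError" in function_names)
```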

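About file objects:

In C, files are read and written through a FILE* .

On Linux, we also learn that the FILE structure wraps a file descriptor, and that the operating system operates on files through that descriptor.

A Python file object likewise holds a file descriptor, along with other attributes of the file. Ultimately, its reads and writes also go through the descriptor.

Because every file object holds a descriptor, and a process can only have a limited number of descriptors open at once, a file should be closed as soon as it is no longer needed.

When a file object is destroyed by the garbage collector, its file descriptor is released as well.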
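The descriptor can be inspected directly. The sketch below uses the fileno method and the closed attribute of a file object; the temporary path is an assumption so that the example does not depend on a Z: drive.

```python
import os
import tempfile

# Hypothetical scratch file standing in for Z:/test.txt.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w") as f:
    f.write("hello world\n")

a = open(path, "r")
fd = a.fileno()              # the integer descriptor held by the file object
print(isinstance(fd, int), a.closed)
a.close()                    # releases the descriptor
print(a.closed)

try:
    a.read()
except ValueError as exc:    # reading from a closed file raises ValueError
    error_message = str(exc)
    print(error_message)
```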

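If the file cannot be opened (for example, because it does not exist), open raises an exception.

a = open(r'Z:\XXX', 'r')  # raw string, so the backslash is not treated as an escape

# Output:
FileNotFoundError: [Errno 2] No such file or directory: 'Z:\\XXX'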
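A program usually wants to handle this failure rather than crash. One possible way, sketched here with a deliberately missing path (missing_path and opened are hypothetical names), is to catch FileNotFoundError:

```python
import os
import tempfile

# Hypothetical path inside a fresh, empty directory, so it cannot exist.
missing_path = os.path.join(tempfile.mkdtemp(), "XXX")

try:
    a = open(missing_path, "r")
except FileNotFoundError as exc:
    opened = False
    print("could not open:", exc.filename)
else:
    opened = True
    a.close()
```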

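Reading files

read: reads up to the given number of characters (bytes in binary mode); with no argument, it reads the rest of the file. In text mode the result is a string.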

a = open('Z:/test.txt', 'r')
print(a.read())
a.close()

# Output:
hello world
hello python
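To illustrate the length argument of read, here is a short sketch. It first creates a file with the same two lines as the article's test.txt (the temporary path is an assumption), then reads it in pieces.

```python
import os
import tempfile

# Hypothetical scratch file with the same contents as the article's test.txt.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w") as f:
    f.write("hello world\nhello python")

with open(path, "r") as a:
    first = a.read(5)     # at most 5 characters
    rest = a.read()       # everything that is left
    at_end = a.read()     # empty string once the end of the file is reached

print(repr(first))        # 'hello'
print(repr(rest))         # ' world\nhello python'
print(repr(at_end))       # ''
```

Each call continues from where the previous one stopped, and an empty string signals the end of the file.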

readline: reads one line and returns it as a string.

a = open('Z:/test.txt', 'r')
print(a.readline())  # hello world
a.close()

readlines: reads the whole file and returns a list. Each item in the list is a string holding one line.

a = open('Z:/test.txt', 'r')
print(a.readlines())  # ['hello world\n', 'hello python']
a.close()

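Looping with for line in f walks through the file line by line, much like calling readline repeatedly. Only one line is held at a time, so it uses far less memory than readlines. It may go back to the I/O device more often, although the file object's buffer usually keeps that cost small.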

a = open('Z:/test.txt', 'r')
for line in a:
    print(line)
a.close()

# Output:
hello world

hello python

Note that readline, readlines, and line iteration all keep the trailing newline, so code like the following is often needed to remove it.

a = open('Z:/test.txt', 'r')
# Each line yielded by for line in f is a string, so string methods can be called on it
for line in a:
    print(line.strip())
a.close()

# Output:
hello world
hello python

# Alternatively, use a list comprehension
a = open('Z:/test.txt', 'r')
data = [line.strip() for line in a.readlines()]
print(data)  # ['hello world', 'hello python']
a.close()

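readlines versus for line in f:

readlines reads everything in a single pass, which is fast but holds the whole file in memory. Iterating reads one line at a time: more read steps and potentially slower, but light on memory. For large files, choose the second approach.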
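As a sketch of the second approach, the function below scans a file line by line and never holds more than one line in memory. The function name count_matching_lines and the generated sample file are assumptions made for illustration.

```python
import os
import tempfile


def count_matching_lines(path, word):
    """Count the lines containing word, reading one line at a time."""
    count = 0
    with open(path, "r") as f:
        for line in f:            # only the current line is kept in memory
            if word in line:
                count += 1
    return count


# Hypothetical sample file: 1000 lines, every tenth one mentions python.
path = os.path.join(tempfile.mkdtemp(), "big.txt")
with open(path, "w") as f:
    for i in range(1000):
        f.write("hello python\n" if i % 10 == 0 else "hello world\n")

print(count_matching_lines(path, "python"))  # 100
```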

Writing files

write: writes a string to the file.

To write, the file must be opened in a writable mode such as 'w' or 'a'. Otherwise the write fails.

a = open('Z:/test.txt', 'r')
a.write("hello Mango")  # io.UnsupportedOperation: not writable

a = open('Z:/test.txt', 'w')
a.write("hello Mango")  # opening with 'w' discards the file's previous contents
a.close()

a = open('Z:/test.txt', 'a')
a.write("hello Lemon")  # opening with 'a' appends to the end
a.close()
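The difference between the two modes is easiest to see by reading the file back after each step. The following sketch does that on a temporary file (the path and the names after_append and after_truncate are assumptions).

```python
import os
import tempfile

# Hypothetical scratch file standing in for Z:/test.txt.
path = os.path.join(tempfile.mkdtemp(), "test.txt")

with open(path, "w") as a:
    a.write("hello Mango")
with open(path, "a") as a:
    a.write("hello Lemon")        # appended directly after the existing text
with open(path, "r") as a:
    after_append = a.read()

with open(path, "w") as a:        # 'w' empties the file before writing
    a.write("hello Mango")
with open(path, "r") as a:
    after_truncate = a.read()

print(after_append)               # hello Mangohello Lemon
print(after_truncate)             # hello Mango
```

The appended text runs straight on from the first string because write does not add a newline on its own.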

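writelines: takes a list (or any iterable) of strings and writes each one.

a = open('Z:/test.txt', 'w')
w = ['Mango\n', 'hello\n', ' world\n']
a.writelines(w)  # write the list's contents to the file
a.close()

There is no writeline function, because that would be the same as calling write with '\n' appended to the string. Likewise, writelines adds no line separators, so every element must already end with '\n'.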
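When the strings do not already end with a newline, one option is to add it while passing them to writelines. This is only a sketch; the words list and the temporary path are assumptions.

```python
import os
import tempfile

# Hypothetical scratch file standing in for Z:/test.txt.
path = os.path.join(tempfile.mkdtemp(), "test.txt")

words = ["Mango", "hello", "world"]        # no trailing newlines
with open(path, "w") as a:
    # writelines accepts any iterable of strings, including a generator
    a.writelines(word + "\n" for word in words)

with open(path, "r") as a:
    lines = a.readlines()

print(lines)   # ['Mango\n', 'hello\n', 'world\n']
```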

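About read/write buffering

From Linux we know that the C library functions fread and fwrite do much the same job as the read and write system calls, except that fread/fwrite are buffered.

Python file I/O supports both buffered and unbuffered operation.

open also accepts a third parameter, buffering, which controls whether a buffer is used and how large it is (see help(open) or print(open.__doc__)).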

print(open.__doc__)  # the docstring describes the buffering parameter
help(open)  # help() prints the text itself, so no print() is needed

Calling the flush method writes the buffer out immediately.
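A rough sketch of both ideas follows: flush pushes buffered data to the operating system so that a second handle can see it, and buffering=0 (allowed only in binary mode) turns the buffer off. The temporary path and the variable names are assumptions.

```python
import os
import tempfile

# Hypothetical scratch file standing in for Z:/test.txt.
path = os.path.join(tempfile.mkdtemp(), "test.txt")

a = open(path, "w")                    # default buffering
a.write("hello")                       # may still sit in the buffer
a.flush()                              # hand the data to the operating system now
with open(path, "r") as reader:
    seen_after_flush = reader.read()
a.close()

raw = open(path, "wb", buffering=0)    # unbuffered; binary mode only
raw.write(b"direct")                   # goes to the operating system at once
with open(path, "rb") as reader:
    seen_unbuffered = reader.read()
raw.close()

print(seen_after_flush)                # hello
print(seen_unbuffered)                 # b'direct'
```

Closing a file also flushes it, so an explicit flush is mainly useful while the file stays open, for example for a log that another process is watching.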
