So you want to know how to copy a file in Python? Good! It's very useful to learn and most complex applications that you may design will need at least some form of copying files.
Copying a Single File in Python
Alright, let's get started. This first section will describe how to copy a single file (not a directory) to another location on the hard disk.
Python has a special module called shutil
for simple, high level file operations that is useful when copying single files.
Here's an example of a function that will copy a single file to a destination file or folder (with error handling/reporting):
[python]
import shutil
def copyFile(src, dest):
try:
shutil.copy(src, dest)
# eg. src and dest are the same file
except shutil.Error as e:
print('Error: %s' % e)
# eg. source or destination doesn't exist
except IOError as e:
print('Error: %s' % e.strerror)
[/python]
And that's it! We just call that method, and the file is copied. If the source or destination file doesn't exist, we print an error notifying the user that the operation has failed. If the source and destination files are the same, we don't copy them and notify the user of the failed operation.
Python shutil's Different Copy Methods
The module shutil
has several methods for copying files other than the simple copy
method that we have seen above.
I'll go over them in some detail here, explaining the differences between them and situations where we might need them.
shutil.copyfileobj(fsrc, fdst[, buffer_length])
This function allows copying of files with the actual file objects themselves. If you've already opened a file to read from and a file to write to using the built-in open
function, then you would use shutil.copyfileobj
. It is also of interest to use this function when it is necessary to specify the buffer length of the copy operation. It may help, when copying large files, to increase the buffer length from its default value of 16 KB in order to speed up the copy operation.
All of the other copy functions mentioned below call this function at some point. It is the "base" copy method.
Let's look at a benchmark for copying files using a 50 KB, 100 KB, 500 KB, 1 MB, 10 MB, and 100 MB buffer size vs a normal copy operation. We will test an archived file in iso format of 3.2 GB.
We'll be using this function to specify the buffer size:
[python]
def copyLargeFile(src, dest, buffer_size=16000):
with open(src, 'rb') as fsrc:
with open(dest, 'wb') as fdest:
shutil.copyfileobj(fsrc, fdest, buffer_size)
[/python]
And Ubuntu's built in time
bash command to time the operation.
Here are the results:
50 KB: 29.539s
100 KB: 27.423s
500 KB: 25.245s
1 MB: 26.261s
10 MB: 25.521s
100 MB: 24.886s
As you can see, there is quite a big difference between the buffer sizes. Almost a 16% decrease in the amount of time it took using a 50 KB buffer size to using a 100 MB buffer size.
The optimal buffer size ultimately depends on the amount of RAM you have available as well as the file size.
shutil.copyfile(src, dst)
This method copies a file from the source, src
, to the destination, dst
. This differs from copy
in that you must ensure that the destination path exists and also contains the file name. For example, '/home/'
would be invalid because it's the name of a directory. '/home/test.txt'
would be valid because it contains a file name.
shutil.copy(src, dst)
The copy
that we used above detects if the destination path contains a file name or not. If the path doesn't contain a file name, copy
uses the original file name in the copy operation. It also copies the permission bits to the destination file.
You would use this function if you are uncertain of the destination path format or if you'd like to copy the permission bits of the source file.
shutil.copy2(src, dst)
This is the same as the copy function we used except it copies the file metadata with the file. The metadata includes the permission bits, last access time, last modification time, and flags.
You would use this function over copy
if you want an almost exact duplicate of the file.
Comparison of Python File Copying Functions
Below we can see a comparison of shutil
's file copying functions, and how they differ.
Function | Copies Metadata | Copies Permissions | Can Specify Buffer |
shutil.copy |
No | Yes | No |
shutil.copyfile |
No | No | No |
shutil.copy2 |
Yes | Yes | No |
shutil.copyfileobj |
No | No | Yes |
That's pretty much the end of copying files. I hope you benefited from this article, and I hope it was worth your time to learn a little bit of file operations in Python.
Until the next time!
Ciao!