Pathlib Python Module

Learn via video course
FREE
View all courses
Python Course for Beginners With Certification: Mastering the Essentials
Python Course for Beginners With Certification: Mastering the Essentials
by Rahul Janghu
1000
4.90
Start Learning
Python Course for Beginners With Certification: Mastering the Essentials
Python Course for Beginners With Certification: Mastering the Essentials
by Rahul Janghu
1000
4.90
Start Learning
Topics Covered

Overview

pathlib is one of Python’s standard utility modules which makes working with file paths very easy and efficient by representing file system paths with semantics suitable for different operating systems. It provides various classes having functionalities and operations to help you save time while handling and manipulating paths. This ensures that file paths work the same across different operating systems, as each operating system has different rules for creating file paths. This article focuses on demonstrating the pathlib Python module and various methods of its paths.

Pathlib Python Module

As we know that each operating system has different rules for constructing file paths. For example, Linux uses forward slashes to distinguish between names in paths, while Windows uses backslashes for the same. So, even this slight difference in semantics can cause problems when you are working and collaborating with people using different operating systems.

Luckily, Python takes care of this. If you're programming in Python, the pathlib module does the work, ensuring that file paths work the same across different operating systems. The pathlib module in Python provides you with various classes that ensure useful functions and operations for handling paths.

NOTE: pathlib comes by default for the Python versions greater than or equal to 3.4. But, if you're using a Python version prior to 3.4, then you will not be able to use this module.

There is a similar module in Python that does the same job, the os.path module. But, the difference between the two is that path os. path module creates strings that represent file paths whereas pathlib creates path objects. The advantage of using a path object instead of a string is that you can call methods on the path object. Also, the pathlib module allows you to perform these operations with a high degree of readability and a minimal amount of code as compared to other modules doing the same functionality.

To get the desired path of a file, you just need to import the module into your program and you can use pathlib's Path object with the Python built-in variable file to refer to the desired file path. Let's look at a simple example to understand this:

Here we are placing a Python file named file.py in a specific directory. I opened the file and entered the above content. Now the pathlib module has been imported and a local variable named obj has been created to store the string object of path. Here, we're using pathlib's Path object with a Python built-in variable called file to reference the file path we're going to write to file.py. Now, printing obj gives the path to the current file. This is because by placing this specific script in the Path object, pathlib creates a path to that file.

pathlib splits file system paths into two different types of classes. These classes represent two different types of path object classes: pure and concrete paths. Subsequent sections of this article describe each type of path object in detail.

Pure Paths

Pure Paths provides pure computational operations and utilities for processing and editing file paths without performing write operations. It will edit a file path on your machine even if they belong to different operating systems.

For example, let's say you are on Linux and you want to use a Windows file path. Here, a pure path class object helps you get a working path on your computer using basic operations like creating subpaths and accessing individual parts of the path. Purepath objects work without actually accessing the file system. It is just useful when we want to manipulate a path without actually accessing the operating system. So you can easily manipulate Windows file system paths in Linux and vice versa using pure class objects.

Note: Pure paths cannot mimic other operations such as creating directories or files because you are not using this operating system.

Pure Paths Clases

Pure paths are divided into three classes as shown in the diagram above. These classes can operate on any file path for system:

  1. pathlib.PurePath class
  2. pathlib.PurePosixPath class
  3. pathlib.PureWindowsPath class

We will see each of these classes in the below sections:

Class pathlib.PurePath(*pathsegments)

The pathlib.PurePath class represents a generic class and creates either pathlib.PurePosixPath or pathlib.PureWindowsPath when instantiating this class. Therefore, these two classes are created to operate on Windows paths and Linux machine paths. So if you're working on a Windows operating system, PureWindowsPath() is called when you instantiate the pathlib.PurePath class, and if you're working on Linux, you get the pathlib.PurePosixPath class.

We can say that PurePosixPath() is a child node of PurePath() class executed for Linux or Unix file paths and PureWindowsPath() is a child node of PurePath() class executed for Windows file system paths. Here, PurePath() is the root node that provides processing operations for each path object in the pathlib.

The output of the above code is:

The PurePosixPath() instance of PurePath class is called as the forward slashes are used for paths which means that it's implemented for the Linux machine.

The output of the above code is:

The PureWindowsPath() instance of PurePath class is called as the backward slashes are used for paths which means that it's implemented for the windows machine.

PurePath Properties

There are some of the properties of the Pure path class that we can use to get the desired operation done. Let's see a few of them.

  1. PurePath().parent property:

This property returns the parent of the path.

The output of the above code is:

So, by using the .parent property of the Pure path class we will get the path of the logical parent of the file.py file.

  1. PurePath().name property:

This property returns the name of the last component of the path.

The output of the above code is:

As you can see from the file path above, the last path component is file.py. So, the .name property gives you the main file name, like in this case the file name is file.py with its file extension as .py.

  1. PurePath().suffix property:

This property gives the file extension (like .py for Python file) of the last component (which consists of file name + the file extension) of the path after removing the file name.

The output of the above code is:

As you can see from the output above, the .suffix property outputs the file extension and excludes the file name.

  1. PurePath().stem property:

This property returns the name of the last component of the path after removing the suffix.

The output of the above code is:

As you can see from the output above, the .stem property removes the file extension, (like .py in this case) of the last component file.py and only returns the file name of the path.

Class pathlib.PurePosixPath(*pathsegments)

Class pathlib.PurePosixPath is a child class of PurePath class and as we know it is the child node of PurePath() implemented for non-Windows file system paths like Linux/Unix machines.

Let's see the example of the PurePosixpath():

The output of the above piece of code is:

Class pathlib.PureWindowsPath(*pathsegments)

This is also a child class of PurePath class and it is implemented for the Windows file system paths.

Let's see an example of this:

The output of the above piece of code is:

PurePath.is_absolute() Method

Each subclass of the PurePath() class provides some of the methods like .is_absolute(), .is_related() and so on. The .is_absolute() method is used to check whether your path is absolute or not. This method returns the boolean value (true or false). The true value would be returned if the path is absolute otherwise the false value would be returned.

Let's see one example of this method:

The returned value would be the verification that the path provided is absolute:

Concrete Paths

A concrete path is another type of path object. It is a subclass of the pure path class because it inherits operations from its parent class and adds input/output operations that perform system calls. As we know, a pure path operates only on a file path without performing a write operation, but in the case of a concrete path, it is possible to operate while performing a write operation on the file path.

Concrete paths consist of three classes that handle any file system path on your machine:

  1. pathlib.Path class
  2. pathlib.PosixPath class
  3. pathlib.WindowsPath class

We will see each of these classes in the below sections:

Class pathlib.Path(*pathsegments)

The pathlib.Path class represents the generic class and upon instantiating this class will create either the pathlib.PosixPath or pathlib.WindowsPath. So, it creates these two classes to operate on Windows paths and non-Windows paths. So, when you are working on the windows operating system, WindowsPath() will be called on instantiating the pathlib. Path class and when you are working on Linux, you will get the pathlib.PosixPath class.

We can say that PosixPath() is the child node of Path() implemented for non-Windows file system paths such as Linux/Unix machines and WindowsPath() is the child node of Path() implemented for Windows file system paths. Here, Path() is the root node that provides handling operations to every path object in Pathlib.

The output of the above code is:

The PosixPath() instance of the Path class is called as the forward slashes are used for paths which means that is implemented for the Linux machine.

The output of the above code is:

The WindowsPath() instance of the Path class is called as the backward slashes are used for paths which means that it is implemented for the windows machine.

Class pathlib.PosixPath(*pathsegments)

Class pathlib.PosixPath is a child class of the Path class and as we know it is the child node of Path() implemented for non-Windows file system paths like Linux/Unix machines.

Let's see the example of the Posixpath():

The output of the above piece of code is:

Class pathlib.WindowsPath(*pathsegments)

This is also a child class of the Path class and it is implemented for the Windows file system paths.

Let's see an example of this:

The output of the above piece of code is:

Few Other Methods Provided by Class Paths

Let's see a few of the other common methods provided by the Path class:

Path.cwd() Method

The Path.cwd() method returns a new path object representing the current working directory of the file.

Let's see an example of this:

The output of the above code is:

This represents the current working directory.

Path.exists() Method

Path().exists() method is used to verify whether the path which is provided is pointing to an existing file or directory or not. This is majorly used to check if the file or a directory exists or not on the path provided by the user. Let's understand this with the help of an example:

The output of the above code would be a boolean entry stating if the file or a directory exists or not.

Path.is_dir() Method

As the name suggests, this method is used to check if the given path is a directory or not. Let's see this with the help of an example:

The output of the above code would be a boolean entry confirming if the given path refers to a directory or not.

Conclusion

  • pathlib is one of Python’s standard utility modules which makes it very easy and efficient to deal with file paths.
  • It provides various classes which ensured that the functionalities and operations will help you in handling paths.
  • pathlib divides the filesystem paths into two different classes that represent two different types of path objects: Pure Path and Concrete Path.
  • Pure paths provide purely computational operations and utilities to handle and manipulate your file path without making writing operations.
  • The pure paths consist of three classes that handle any file system path on your machine: pathlib.PurePath class, pathlib.PurePosixPath class, pathlib.PureWindowsPath class.
  • The pathlib.PurePath class represents the generic class and upon instantiating this class will create either the pathlib.PurePosixPath or pathlib.PureWindowsPath.
  • PurePosixPath() is the child node of PurePath() implemented for non-Windows file system paths like linux/unix machines whereas PureWindowsPath() is the child node of PurePath() implemented for Windows file system paths.
  • The Path.cwd() method returns a new path object which represents the current working directory of the file.
  • Path().exists() method is used to verify whether the path which is provided is pointing to an existing file or directory or not.
  • Path.is_dir() method is used to check if the given path is a directory or not.
  • Concrete paths inherent manipulation from the parent class and add input/output operations that do system calls.
  • Pure path only manipulates the file path without making write operations, while concrete path allows you to do the manipulation along with performing the write operations on your file path.
  • Concrete paths consist of three classes that handle any file system path on your machine: pathlib.Path class, pathlib.PosixPath class, pathlib.WindowsPath class.
  • The pathlib. Path class represents the generic class and upon instantiating this class will create either the pathlib.PosixPath or pathlib.WindowsPath.
  • PosixPath() is the child node of Path() implemented for non-Windows file system paths like linux/unix machines while WindowsPath() is the child node of Path() implemented for Windows file system paths.