.. _pyporting-howto:

*********************************
Porting Python 2 Code to Python 3
*********************************

:author: Brett Cannon

.. topic:: Abstract

   With Python 3 being the future of Python while Python 2 is still in active
   use, it is good to have your project available for both major releases of
   Python. This guide is meant to help you choose which strategy works best
   for your project to support both Python 2 & 3 along with how to execute
   that strategy.

   If you are looking to port an extension module instead of pure Python code,
   please see :ref:`cporting-howto`.


Choosing a Strategy
===================

When a project decides it is time to support both Python 2 & 3, it must also
decide how to go about accomplishing that goal. The right strategy depends on
how large the project's existing codebase is and how much you are willing to
let your Python 3 codebase diverge from your Python 2 one (e.g., starting a new
version with Python 3).

If your project is brand-new or does not have a large codebase, then you may
want to consider :ref:`writing or porting all of your code for Python 3 and
using 3to2 <use_3to2>` to translate it for Python 2.

If you would prefer to maintain a codebase which is semantically **and**
syntactically compatible with Python 2 & 3 simultaneously, you can write
:ref:`use_same_source`. While this tends to lead to somewhat non-idiomatic
code, it does mean you keep a rapid development process for you, the developer.

Finally, you do have the option of :ref:`using 2to3 <use_2to3>` to translate
Python 2 code into Python 3 code (with some manual help). This can take the
form of branching your code and using 2to3 to start a Python 3 branch. You can
also have users perform the translation at installation time automatically so
that you only have to maintain a Python 2 codebase.

Regardless of which approach you choose, porting is not as hard or
time-consuming as you might initially think. You can also tackle the problem
piecemeal, as a good portion of porting is simply updating your code to follow
current best practices in a Python 2/3 compatible way.


Universal Bits of Advice
------------------------

Regardless of what strategy you pick, there are a few things you should
consider.

One, make sure you have a robust test suite. You need to make sure everything
continues to work, just like when you support a new minor version of Python.
This means making sure your test suite is thorough and is ported properly
between Python 2 & 3. You will also most likely want to use something like tox_
to automate testing under both a Python 2 and a Python 3 VM.

Two, once your project has Python 3 support, make sure to add the proper
classifier on the Cheeseshop_ (PyPI_). To have your project listed as Python 3
compatible it must have the
`Python 3 classifier <http://pypi.python.org/pypi?:action=browse&c=533>`_
(from
http://techspot.zzzeek.org/2011/01/24/zzzeek-s-guide-to-python-3-porting/)::

   # include_package_data is a setuptools option, so import setup from there
   from setuptools import setup

   setup(
     name='Your Library',
     version='1.0',
     classifiers=[
         # make sure to use :: Python *and* :: Python :: 3 so
         # that pypi can list the package on the python 3 page
         'Programming Language :: Python',
         'Programming Language :: Python :: 3',
     ],
     packages=['yourlibrary'],
     # make sure to add custom_fixers to the MANIFEST.in
     include_package_data=True,
     # ...
   )


Doing so will cause your project to show up in the
`Python 3 packages list
<http://pypi.python.org/pypi?:action=browse&c=533&show=all>`_. You will know
you set the classifier properly because your project page on the Cheeseshop
will show a Python 3 logo in the upper-left corner of the page.

Three, the six_ project provides a library which helps iron out differences
between Python 2 & 3. If you find there is a sticky point that is a continual
point of contention in your translation or maintenance of code, consider using
a source-compatible solution relying on six. If you have to create your own
Python 2/3 compatible solution, you can use ``sys.version_info[0] >= 3`` as a
guard.
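
A minimal sketch of such a guard, for cases where you have to write your own.
The names ``PY3``, ``text_type``, ``binary_type`` and ``ensure_text`` are
choices made for this example (six provides comparable helpers), and UTF-8 is
an assumed default encoding::

   import sys

   # True when running under Python 3 or newer.
   PY3 = sys.version_info[0] >= 3

   if PY3:
       text_type = str
       binary_type = bytes
   else:
       # Only evaluated under Python 2, where ``unicode`` exists.
       text_type = unicode
       binary_type = str


   def ensure_text(value, encoding='utf-8'):
       """Return ``value`` as text, decoding it first if it is bytes."""
       if isinstance(value, binary_type):
           return value.decode(encoding)
       return value

Keeping checks like this in one small compatibility module tends to be easier
to maintain than scattering version tests throughout the code.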

Four, read all the approaches. Just because a bit of advice applies to one
approach more than another does not mean it is irrelevant to the other
strategies.

Five, drop support for older Python versions if possible. `Python 2.5`_
introduced a lot of useful syntax and libraries which have become idiomatic
in Python 3. `Python 2.6`_ introduced future statements which make
compatibility much easier if you are going from Python 2 to 3.
`Python 2.7`_ continues the trend in the stdlib. So choose the newest version
of Python which you believe can be your minimum supported version
and work from there.


.. _tox: http://codespeak.net/tox/
.. _Cheeseshop:
.. _PyPI: http://pypi.python.org/
.. _six: http://packages.python.org/six
.. _Python 2.7: http://www.python.org/2.7.x
.. _Python 2.6: http://www.python.org/2.6.x
.. _Python 2.5: http://www.python.org/2.5.x
.. _Python 2.4: http://www.python.org/2.4.x
.. _Python 2.3: http://www.python.org/2.3.x
.. _Python 2.2: http://www.python.org/2.2.x


.. _use_3to2:

Python 3 and 3to2
=================

If you are starting a new project or your codebase is small enough, you may
want to consider writing your code for Python 3 and backporting to Python 2
using 3to2_. Thanks to Python 3 being more strict about things than Python 2
(e.g., bytes vs. strings), the source translation can be easier and more
straightforward than from Python 2 to 3. Plus it gives you more direct
experience developing in Python 3 which, since it is the future of Python, is a
good thing long-term.

A drawback of this approach is that 3to2 is a third-party project. This means
that the Python core developers (and thus this guide) can make no promises
about how well 3to2 works at any time. There is nothing to suggest, though,
that 3to2 is not a high-quality project.


.. _3to2: https://bitbucket.org/amentajo/lib3to2/overview


.. _use_2to3:

Python 2 and 2to3
=================

Included with Python since 2.6, the 2to3_ tool (and :mod:`lib2to3` module)
helps with porting Python 2 to Python 3 by performing various source
translations. This is a perfect solution for projects which wish to branch
their Python 3 code from their Python 2 codebase and maintain them as
independent codebases. You can even begin preparing to use this approach
today by writing future-compatible Python code which works cleanly in
Python 2 in conjunction with 2to3; all steps outlined below will work
with Python 2 code up to the point when the actual use of 2to3 occurs.

Use of 2to3 as an on-demand translation step at install time is also possible,
preventing the need to maintain a separate Python 3 codebase, but this approach
does come with some drawbacks. While users will only have to pay the
translation cost once at installation, you as a developer will need to pay the
cost regularly during development. If your codebase is sufficiently large,
then the translation step ends up acting like a compilation step,
robbing you of the rapid development process you are used to with Python.
Obviously the time required to translate a project will vary, so do an
experimental translation just to see how long it takes to evaluate whether you
prefer this approach compared to using :ref:`use_same_source` or simply keeping
a separate Python 3 codebase.

Below are the typical steps taken by a project which uses a 2to3-based approach
to supporting Python 2 & 3.


Support Python 2.7
------------------

As a first step, make sure that your project is compatible with `Python 2.7`_.
This is worthwhile because Python 2.7 is the last release of Python 2 and thus
will be used for a rather long time. It also lets you run Python with the
``-3`` flag to help discover places in your code which 2to3 cannot handle but
are known to cause issues.

Try to Support `Python 2.6`_ and Newer Only
-------------------------------------------

While not possible for all projects, if you can support `Python 2.6`_ and newer
**only**, your life will be much easier. Various future statements, stdlib
additions, etc. exist only in Python 2.6 and later which greatly assist in
porting to Python 3. But if your project must keep support for `Python 2.5`_ (or
even `Python 2.4`_) then it is still possible to port to Python 3.

Below are the benefits you gain if you only have to support Python 2.6 and
newer. Some of these options are personal choice while others are
**strongly** recommended (the ones that are more for personal choice are
labeled as such).  If you continue to support older versions of Python then you
at least need to watch out for situations that these solutions fix.


``from __future__ import print_function``
'''''''''''''''''''''''''''''''''''''''''

This is a personal choice. 2to3 handles the translation from the print
statement to the print function rather well so this is an optional step. This
future statement does help, though, with getting used to typing
``print('Hello, World')`` instead of ``print 'Hello, World'``.
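
As a small illustration, the sketch below behaves the same under Python 2.6+
and Python 3. The ``report`` helper is a hypothetical name used only for this
example::

   from __future__ import print_function

   import sys


   def report(message, stream=None):
       """Write ``message`` followed by a newline to ``stream``."""
       if stream is None:
           stream = sys.stderr
       # ``file=`` replaces the old ``print >>stream, message`` syntax.
       print(message, file=stream)


   print('Hello, World')
   print('a', 'b', 'c', sep='-')  # writes "a-b-c"
   report('something went wrong')

Note that the future statement has to appear at the top of the module, before
any other code.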


``from __future__ import unicode_literals``
'''''''''''''''''''''''''''''''''''''''''''

Another personal choice. You can always mark what you want to be a (unicode)
string with a ``u`` prefix to get the same effect. But regardless of whether
you use this future statement or not, you **must** make sure you know exactly
which Python 2 strings you want to be bytes, and which are to be strings. This
means you should, **at minimum** mark all strings that are meant to be text
strings with a ``u`` prefix if you do not use this future statement.
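
A short sketch of marking literals explicitly, without the future statement.
The variable names are illustrative only. Note that the ``u`` prefix is not
accepted by Python 3.0 through 3.2, so code like this relies on 2to3 (which
strips the prefix) or on Python 3.3 and newer::

   # Text: explicitly marked as unicode.
   greeting = u'caf\xe9'

   # Binary data: the same word encoded as UTF-8.
   payload = b'caf\xc3\xa9'

   # Cross the text/bytes boundary only through explicit encode/decode calls.
   assert payload.decode('utf-8') == greeting
   assert greeting.encode('utf-8') == payload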


Bytes literals
''''''''''''''

This is a **very** important one. The ability to prefix Python 2 strings that
are meant to contain bytes with a ``b`` prefix helps to very clearly delineate
what is and is not a Python 3 string. When you run 2to3 on code, all Python 2
strings become Python 3 strings **unless** they are prefixed with ``b``.

There are some differences between byte literals in Python 2 and those in
Python 3 thanks to the bytes type just being an alias to ``str`` in Python 2.
Probably the biggest "gotcha" is that indexing results in different values. In
Python 2, the value of ``b'py'[1]`` is ``'y'``, while in Python 3 it's ``121``.
You can avoid this disparity by always slicing at the size of a single element:
``b'py'[1:2]`` is ``'y'`` in Python 2 and ``b'y'`` in Python 3 (i.e., close
enough).
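
The slicing idiom can be sketched as follows; ``iter_bytes`` is a hypothetical
helper name used only for illustration::

   def iter_bytes(data):
       """Yield each element of ``data`` as a length-1 bytes object."""
       for index in range(len(data)):
           # Slicing keeps the bytes type; indexing would give an int
           # in Python 3.
           yield data[index:index + 1]


   header = b'py'
   assert header[1:2] == b'y'
   assert list(iter_bytes(header)) == [b'p', b'y']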

You cannot concatenate bytes and strings in Python 3. But since Python
2 has bytes aliased to ``str``, the concatenation succeeds there:
``b'a' + u'b'`` works in Python 2, but ``b'a' + 'b'`` in Python 3 is a
:exc:`TypeError`. A similar issue also comes about when doing comparisons
between bytes and strings.
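
One way to sidestep both problems is to convert explicitly before combining
values. A minimal sketch, assuming the bytes hold UTF-8 data; the function
name ``join_as_text`` is made up for this example::

   def join_as_text(raw, suffix, encoding='utf-8'):
       """Decode the bytes ``raw`` and append the text ``suffix``."""
       return raw.decode(encoding) + suffix


   assert join_as_text(b'a', u'b') == u'ab'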


Supporting `Python 2.5`_ and Newer Only
---------------------------------------

If you are s Newer Only
----------------roject has Python 3 support, make sure to add the proper
classifier on the Cheeseshop_ (PyPI_). To have your project listed as Python 3
compatible it must have the
`Python 3 classifier <http://pypi.python.org/pypi?:action=br