TypeError: must be str, not bytes

I was looking for a way to run an external process from a Python script and print its stdout messages while it is running.
The code below works, but prints no stdout output during runtime. When the process exits, I get the following error:

sys.stdout.write(nextline) TypeError: must be str, not bytes

I am using Python 3.3.2.

2 Answers

Python 3 handles strings a bit differently from Python 2. Originally there was just one type for strings: str . When Unicode gained traction in the '90s, the new unicode type was added to handle Unicode without breaking pre-existing code 1 . It is effectively the same as str , but with multibyte support.

In Python 3 there are two different types:

The bytes type. This is just a sequence of bytes; Python doesn't know anything about how to interpret it as characters.
The str type. This is a sequence of characters (Unicode code points), so Python knows how to interpret it as text.
The separate unicode type was dropped; str now supports Unicode.
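
The difference between the two types can be seen in a short sketch. The variable names are illustrative and do not come from the question:

```python
text = "H€llo"               # str: a sequence of characters
data = text.encode("utf-8")  # bytes: the same text as raw UTF-8 bytes

print(type(text).__name__, len(text))  # str, counted in characters
print(type(data).__name__, len(data))  # bytes, counted in bytes ('€' takes three)

# Indexing a str yields a one-character str; indexing bytes yields an int.
second_char = text[1]
second_byte = data[1]

# Python 3 refuses to mix the two types implicitly.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(second_char, second_byte, mixed_ok)
```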

In Python 2, implicitly assuming an encoding could cause a lot of problems: you could end up using the wrong encoding, or the data might not have an encoding at all (e.g. it's a PNG image).
Explicitly telling Python which encoding to use (or explicitly telling it to guess) is often a lot better and much more in line with the Python philosophy that "explicit is better than implicit".

This change is incompatible with Python 2 because many return values have changed, leading to subtle problems like this one; it's probably the main reason why Python 3 adoption has been so slow. Since Python doesn't have static typing 2 , it's impossible to fix this automatically with a script (such as the bundled 2to3 ).

You can convert str to bytes with bytes('H€llo', 'utf-8') ; this should produce b'H\xe2\x82\xacllo' . Note how one character was converted to three bytes.
You can convert bytes to str with b'H\xe2\x82\xacllo'.decode('utf-8') .

Of course, UTF-8 may not be the correct encoding in your case, so be sure to use the correct one.
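
As a minimal sketch of the round trip, and of what happens when the wrong encoding is picked (latin-1 and ascii are used here only as examples of wrong choices):

```python
original = "H€llo"

encoded = bytes(original, "utf-8")  # same result as original.encode("utf-8")
decoded = encoded.decode("utf-8")   # round trip back to str

# Decoding with a different codec does not fail here, but yields garbage text.
wrong = encoded.decode("latin-1")

# A stricter codec raises instead of silently producing garbage.
try:
    encoded.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(encoded, decoded, repr(wrong), ascii_ok)
```

If the encoding is uncertain, passing errors="replace" to decode() is one way to avoid an exception at the cost of losing the undecodable bytes.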

In your specific piece of code, nextline is of type bytes , not str . The stdout and stdin pipes of subprocess changed in Python 3 from str to bytes , because Python can't be sure which encoding the data uses. It is probably the same as sys.stdin.encoding (the encoding of your system), but Python can't be sure.

You need to replace:

You will also need to change if nextline == '' to if nextline == b'' , since an empty str never compares equal to an empty bytes object ( '' == b'' is False ).
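
A minimal sketch of both changes, assuming the child process writes UTF-8. The helper name stream_output and its parameters are hypothetical and are not the asker's code:

```python
import subprocess
import sys


def stream_output(command, encoding="utf-8"):
    """Run command, echo its output line by line, return (lines, returncode)."""
    process = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    lines = []
    while True:
        nextline = process.stdout.readline()  # bytes, not str
        if nextline == b"":                   # end of output; '' would never match
            break
        text = nextline.decode(encoding, errors="replace")  # bytes -> str
        sys.stdout.write(text)
        sys.stdout.flush()
        lines.append(text)
    return lines, process.wait()


if __name__ == "__main__":
    # Example: run a tiny Python child process and echo what it prints.
    stream_output([sys.executable, "-c", "print('hello')"])
```

Writing the raw bytes with sys.stdout.buffer.write(nextline) would also avoid the TypeError, but then no decoding happens at all and the terminal has to make sense of the bytes itself.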

1 There are some neat tricks you can do with ASCII that you can't do with multibyte character sets; the most famous examples are "xor with space to switch case" (e.g. chr(ord('a') ^ ord(' ')) == 'A' ) and "set 6th bit to make a control character" (e.g. ord('\t') + ord('@') == ord('I') ). ASCII was designed at a time when manipulating individual bits was an operation with a non-negligible performance impact.

2 Yes, you can use function annotations, but it’s a comparatively new feature and little used.
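
The ASCII tricks from footnote 1 can be checked directly. This is only an illustration; the constant names are made up for the example:

```python
SPACE = ord(" ")  # 0x20, the bit that separates upper case from lower case
AT = ord("@")     # 0x40, the offset between a control character and its letter

upper = chr(ord("a") ^ SPACE)        # flip the case bit: 'a' -> 'A'
lower = chr(ord("A") ^ SPACE)        # flipping again goes back: 'A' -> 'a'
tab_letter = chr(ord("\t") + AT)     # tab is Ctrl-I, so this gives 'I'

print(upper, lower, tab_letter)
```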

Comments

leematthewshome commented Jun 25, 2019

Versions

Python: 3.5.3 (running in virtualenv and in root)
(seems to be running Python 3.7.1 on Android based on the logs)
OS: Ubuntu 17.04
Kivy: 1.9.1
Cython: 0.28.6
Android 7.0 (on Samsung Galaxy S6)

Description

I have compiled a very basic helloWorld app as below. The app works fine on Ubuntu and also compiles fine into an APK using buildozer.

When I run on Android the app immediately closes before displaying the hello world screen. After working through logs produced with adb logcat I see the following.

This appears to be the same issue as #1691, titled "unicode error during startup (python3, numpy, opencv) - patch included". If I am reading this patch correctly, the current file _android.pyx should have the following changes:

Removed this line:
python_act = autoclass(JAVA_NAMESPACE + u'.PythonActivity')
