PyTA Project: Checking Pycodestyle Error in test_examples.py

RaineRaine
5 min read

This article is a continuation of the previous task (https://raineyang.hashnode.dev/pyta-project-refactoring-testcheckondir-in-testcheckpy), in which I left the problem that the refactored test_examples module does not check for PEP8 errors due to different file name format. In this article, I will develop a unit test that specially tests PEP8 errors on PythonTA.

The Test Cases

Test cases for PEP8 errors are located at examples/custom_checkers_e9989_pycodestyle directory. Each file name starts with the PEP8 error code, followed by an indication whether the file contains a PEP8 error or follows the convention (no error).

Similar to the implementation in test_examples_filles_pyta, we can get the test cases and error code by examining the file name:

_PYCODESTYLE_PATH = "examples/custom_checkers/e9989_pycodestyle/"
_PYCODESTYLE_PREFIX_REGEX = r"^e\d{3}"

@pytest.mark.parametrize("test_file", get_file_paths(_PYCODESTYLE_PATH))
def test_pycodestyle_errors_pyta(test_file: str, check_pycodestyle_pyta: Dict[str, Set[str]]) -> None:
    """
    Run python_ta on all pycodestyle error test cases
    """
    base_name = os.path.basename(test_file)
    if not re.match(_PYCODESTYLE_PREFIX_REGEX, base_name[:4]):
        return
    if not base_name.lower().endswith(".py"):
        assert False

    # get the specific PEP8 error code
    has_error = base_name[5:] == "error.py"
    error_code = base_name[:4].upper()

After that, we want to check two aspects of PythonTA's output: first, whether an pep8-errors message is reported,. Second, whether the error code reported correspond to the one in the file name.

    if has_error:
        found_pycodestyle_message = "pep8-errors" in file_symbols
        assert found_pycodestyle_message, f"Failed {test_file}. File does not add expected message."

As shown in the code above, the first case is easy to handle. However, the specific PEP8 error type is not shown in the message symbol. By observing PythonTA's json message , we notice that the error message ("msg") contains the error code. Thus, if we are able to retrieve the full message in addition to message symbol, we are able to determine whether the specified PEP8 error is reported.

[
    {
        "filename": "C:\\University of Toronto\\Projects\\PythonTA Project\\pyta\\examples\\custom_checkers\\e9989_pycodestyle\\e115_error.py",
        "msgs": [
            {
                "msg_id": "E9989",
                "symbol": "pep8-errors",
                "msg": "Found pycodestyle (PEP8) style error E115 at line 2, column 0: expected an indented block (comment)",
                "C": "E",
                "category": "error",
                "confidence": [
                    "UNDEFINED",
                    "Warning without any associated confidence level."
                ],
                "abspath": "C:\\University of Toronto\\Projects\\PythonTA Project\\pyta\\examples\\custom_checkers\\e9989_pycodestyle\\e115_error.py",
                "path": "e115_error.py",
                "module": "e115_error",
                "obj": "",
                "line": 2,
                "column": 0,
                "end_line": null,
                "end_column": null,
                "snippet": "    1  if True:\n    2  # No indented block follows the colon  # INDENT THIS LINE\n    3      pass\n    4  \n",
                "line_end": null,
                "column_end": null,
                "number_of_occurrences": 1
            },

Refactoring symbols_by_files_pyta

Now we want to make the following modifications to symbols_by_files_pyta function:

  1. Passing the path to pycodestyle errors, instead of examples, into PythonTA

  2. Adding an option to include full messages, in addition to message symbols, to the returned dictionary.

Both changes require us to pass arguments to symbols_by_files_pyta function. However, we are invoking symbols_by_files_pyta indirectly through the use of @pytest.fixture decorator, which automatically invokes the function without arguments before every unit test. To solve this problem, we can make symbols_by_files_pyta to be a helper function, and create two separate pytest fixtures that runs symbols_by_files_pyta with different parameters.

@pytest.fixture(scope="session")
def check_examples_pyta() -> Dict[str, Set[str]]:
    """
    Checking the examples files with python_ta
    """
    return _symbols_by_file_pyta([_EXAMPLES_PATH, _CUSTOM_CHECKER_PATH])


@pytest.fixture(scope="session")
def check_pycodestyle_pyta() -> Dict[str, Set[str]]:
    """
    Checking the pycodestyle error files with python_ta, including the error message to
    check for the specific error type
    """
    return _symbols_by_file_pyta([_PYCODESTYLE_PATH], include_msg=True)

Inside _symbols_by_file_pyta method, we changes the module_name parameter of python_ta.check_all() to the argument paths . In addition, when include_msg is set True, the full message will be added into the message symbol dictionary.

Here's another change worth noting. In the previous implementation, I used msgs[0]["path"] to get the file name of test case. This would fail when the msgs is an empty list, which does not occur for other test cases but would happen for no_error cases in PEP8 errors. Instead, we can get filename attribute of json message, which correspond to the file's full path, and then use os.path.basename() to get the file name from path.

def _symbols_by_file_pyta(paths: List[str], include_msg: bool = False) -> Dict[str, Set[str]]:
    """
    Run python_ta on all the example files and return the map of file name to the
    set of PythonTA messages it raises.

    :param paths: The paths to retrieve the files from.
    :param include_msg: whether to include msgs[msg] in the symbol set
    """
    sys.stdout = StringIO()
    python_ta.check_all(
        module_name=get_file_paths(paths),
        config={
            "output-format": "python_ta.reporters.JSONReporter",
        }
    )

    jsons_output = sys.stdout.getvalue()
    sys.stdout = sys.__stdout__
    pyta_list_output = json.loads(jsons_output)

    file_to_symbol = {}
    for path, group in itertools.groupby(pyta_list_output, key=lambda d: os.path.basename(d["filename"])):
        symbols = set()
        for message in group:
            for msg in message["msgs"]:
                symbols.add(msg["symbol"])
                if include_msg:
                    symbols.add(msg["msg"])

        file = os.path.basename(path)
        file_to_symbol[file] = symbols

    return file_to_symbol

Finally, we are able to test whether the error code indicated in the file name occurs in error message:

assert any(error_code in msg for msg in file_symbols), f"Failed {test_file}. The correct PEP8 error type is not in reported message."

The Complete Code

Here is the complete modified code for this change:

_PYCODESTYLE_PATH = "examples/custom_checkers/e9989_pycodestyle/"
_PYCODESTYLE_PREFIX_REGEX = r"^e\d{3}"


def _symbols_by_file_pyta(paths: List[str], include_msg: bool = False) -> Dict[str, Set[str]]:
    """
    Run python_ta on all the example files and return the map of file name to the
    set of PythonTA messages it raises.

    :param paths: The paths to retrieve the files from.
    :param include_msg: whether to include msgs[msg] in the symbol set
    """
    sys.stdout = StringIO()
    python_ta.check_all(
        module_name=get_file_paths(paths),
        config={
            "output-format": "python_ta.reporters.JSONReporter",
        }
    )

    jsons_output = sys.stdout.getvalue()
    sys.stdout = sys.__stdout__
    pyta_list_output = json.loads(jsons_output)

    file_to_symbol = {}
    for path, group in itertools.groupby(pyta_list_output, key=lambda d: os.path.basename(d["filename"])):
        symbols = set()
        for message in group:
            for msg in message["msgs"]:
                symbols.add(msg["symbol"])
                if include_msg:
                    symbols.add(msg["msg"])

        file = os.path.basename(path)
        file_to_symbol[file] = symbols

    return file_to_symbol


@pytest.fixture(scope="session")
def check_examples_pyta() -> Dict[str, Set[str]]:
    """
    Checking the examples files with python_ta
    """
    return _symbols_by_file_pyta([_EXAMPLES_PATH, _CUSTOM_CHECKER_PATH])


@pytest.fixture(scope="session")
def check_pycodestyle_pyta() -> Dict[str, Set[str]]:
    """
    Checking the pycodestyle error files with python_ta, including the error message to
    check for the specific error type
    """
    return _symbols_by_file_pyta([_PYCODESTYLE_PATH], include_msg=True)


@pytest.mark.parametrize("test_file", get_file_paths(_PYCODESTYLE_PATH))
def test_pycodestyle_errors_pyta(test_file: str, check_pycodestyle_pyta: Dict[str, Set[str]]) -> None:
    """
    Run python_ta on all pycodestyle error test cases
    """
    base_name = os.path.basename(test_file)
    if not re.match(_PYCODESTYLE_PREFIX_REGEX, base_name[:4]):
        return
    if not base_name.lower().endswith(".py"):
        assert False

    # get the specific PEP8 error code
    has_error = base_name[5:] == "error.py"
    error_code = base_name[:4].upper()

    test_file_name = os.path.basename(test_file)
    file_symbols = check_pycodestyle_pyta[test_file_name]

    if has_error:
        found_pycodestyle_message = "pep8-errors" in file_symbols
        assert found_pycodestyle_message, f"Failed {test_file}. File does not add expected message."
        assert any(error_code in msg for msg in file_symbols), f"Failed {test_file}. The correct PEP8 error type is not in reported message."
0
Subscribe to my newsletter

Read articles from Raine directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Raine
Raine