C++, Implement filter function for refactoring to keep code dry, clean and flexible

Ivan YuIvan Yu
4 min read

You may be asked to add a function to get a list of nested items fulfilling certain condition and there is already another function doing similar thing, but with different condition. You can certainly create a new function to achieve this and ignore the existing one, but this may produce a lot of repeated code. So here, we suggest to use filter function and explain the usage with the following example.

Scenario

Example project: https://github.com/ivanyu199012/13-Refactor-FilterFunctions

Current code is as below (filter1.cpp):

PersonVector getPeopleOldThan18(const CityVector &cityVector);

int main()
{
  const CityVector cities = initData();
  std::cout << "----------------------------" << std::endl;
  std::cout << "People larger than 18:" << std::endl;
  PersonVector personVector = getPeopleOldThan18(cities);
  for (auto person : personVector)
  {
    std::cout << person->getName() << ", Age: " << person->getAge() << std::endl;
  }
}

PersonVector getPeopleOldThan18(const CityVector &cityVector)
{
  PersonVector outputPersonVector;
  for (auto city : cityVector)
  {
    for (auto pPerson : city->getCitizens())
    {
      if (pPerson->getAge() > 18)
      {
        outputPersonVector.push_back(pPerson);
      }
    }
  }
  return outputPersonVector;
}

There is an existing function getPeopleOldThan18() to get a vector of person older than 18 years old and each class City instance contains a vector of people.

City data structure(Model.cpp):

new Person("John", 34, 10000), // Name, Age, Income
const CityVector initData()
{
  std::vector<Person *> londonCitizens{new Person("John", 34, 10000),
                                       new Person("Peter", 54, 20000),
                                       new Person("Mary", 17, 3000)};

  std::vector<Person *> beijingCitizens{new Person("Wong Yin", 32, 10000),
                                        new Person("Xiu Ming", 52, 20000),
                                        new Person("Tai Man", 15, 3000)};

  std::vector<Person *> seoulCitizens{new Person("Mr. Pak", 32, 10000),
                                      new Person("Gilbin", 52, 20000),
                                      new Person("YoungSoo", 15, 3000)};

  return CityVector {new City("London", londonCitizens),
                    new City("Beijing", beijingCitizens),
                    new City("Seoul", seoulCitizens)};
}

Requirement: Add a function to retrieve people with income larger than 15000

You can add a new function getPeopleEarnMoreThan15000() and ignore the existing function as below(filter2.cpp):

// New function
PersonVector getPeopleEarnMoreThan15000(const CityVector &cityVector)
{
  PersonVector outputPersonVector;
  for (auto city : cityVector)
  {
    for (auto pPerson : city->getCitizens())
    {
      if (pPerson->getIncome() > 15000)
      {
        outputPersonVector.push_back(pPerson);
      }
    }
  }
  return outputPersonVector;
}

PersonVector getPeopleOldThan18(const CityVector &cityVector)
{
  PersonVector outputPersonVector;
  for (auto city : cityVector)
  {
    for (auto pPerson : city->getCitizens())
    {
      if (pPerson->getAge() > 18)
      {
        outputPersonVector.push_back(pPerson);
      }
    }
  }
  return outputPersonVector;
}

But except the if statements if (pPerson->getAge() > 18) and pPerson->getIncome() > 15000, remaining code of 2 functions are the same and a lot of repeated code is produced.

Better solution: Filter function

The concept is to add a new function getPeopleWithFilter() with an extra function parameter personFilter, which function's input is Person and output is boolean, and call the function in the if statement if(personFilter(pPerson)).(filter3.cpp)

PersonVector getPeopleWithFilter(const CityVector &cityVector,
 // Function Parameter
 const std::function<bool(Person *)> &personFilter )
{
  PersonVector outputPersonVector;
  for (auto city : cityVector)
  {
    for (auto pPerson : city->getCitizens())
    {
      // Call the filter function
      if (personFilter(pPerson))
      {
        outputPersonVector.push_back(pPerson);
      }
    }
  }
  return outputPersonVector;
}

Call above function with the existing function getPeopleOldThan18() and new function getPeopleIncomeMoreThan15000().

PersonVector getPeopleIncomeMoreThan15000(const CityVector &cityVector)
{
  std::function<bool(Person *)>
      personFilter = [](const Person *pPerson) -> bool
  {
    return pPerson->getIncome() > 15000;
  };
  // Call the above function
  return getPeopleWithFilter(cityVector, personFilter);
}

PersonVector getPeopleOldThan18(const CityVector &cityVector)
{
  std::function<bool(Person *)>
      personFilter = [](const Person *pPerson) -> bool
  {
    return pPerson->getAge() > 18;
  };
  // Call the above function
  return getPeopleWithFilter(cityVector, personFilter);
}

This code has 2 advantages:

  • Keep the code dry (reduce repeated code).
  • Only need to define the filter function and call getPeopleWithFilter() if you need to add another function to retrieve a list of nested item in the future.

Reason I wrote this blog

I tried to share the coding technique I have learnt and used in my work and hope that others can gain advantage from it. Also, I can refer to this blog if I forget the skill in the future.

Maybe some of you notice, I have not wrote any blog for about a year since I have moved to another company from last year and I spend a lot of time to adapt the new environment.

In this company, rather than developing new software, mostly I fix bug, improve and enhance current software. So refactoring code becomes my main task.

I have concluded some important points for refactoring the code.

  • Understand what the code is about.
  • Find out why the code is wrote like this.
  • For fixing bug, discovering root cause is crucial
    • If time is limited, you may fix problem with a fast temporary solution first and later and DON'T FORGET, pinpoint root cause
    • If root cause identified, this CAN probably lead to more serious problem later
  • MINIMIZE CODE CHANGE since more change may produce more bugs or errors. In this blog, we only modify the content of the function getPeopleOldThan18() without changing name, input and output.
0
Subscribe to my newsletter

Read articles from Ivan Yu directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Ivan Yu
Ivan Yu