Unleash the full power of Flutter widget tests

If you’re a Flutter developer, you may have written widget tests before. But have you unleashed their full power?

Judging by the Flutter developers I have worked with over the last couple of years, I would venture it’s a “no”. Not because of a lack of skill (I have worked with many excellent developers), but because best practices regarding testing take a little longer to establish when a framework is still “new”.

The common approach

Widget tests are often written as a unit test for a widget. The widget is launched, all the data passed to the widget is mocked, any state management used by the widget is mocked, and there you have it, a unit test for a widget.

For example, you may have a widget that has a couple of parameters: a string title and an onTap function. Your “unit” widget test would check that the function is called when the title is tapped. That’s great if you develop a UI library, but let’s face it, most bugs do not reside at that level. Most bugs reside in the interactions between widgets and/or (async) data.
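To make the “unit” style concrete, here is a minimal sketch of such a test. The TitleTile widget and its name are made up for illustration; only flutter_test is assumed.

```dart
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';

// Hypothetical widget for illustration: a title that reports taps.
class TitleTile extends StatelessWidget {
  const TitleTile({super.key, required this.title, required this.onTap});

  final String title;
  final VoidCallback onTap;

  @override
  Widget build(BuildContext context) {
    return GestureDetector(onTap: onTap, child: Text(title));
  }
}

void main() {
  testWidgets('calls onTap when the title is tapped', (tester) async {
    var tapCount = 0;

    // Only this one widget is launched. MaterialApp is here just to
    // provide the Directionality that Text needs.
    await tester.pumpWidget(
      MaterialApp(
        home: Scaffold(
          body: TitleTile(title: 'Hello', onTap: () => tapCount++),
        ),
      ),
    );

    await tester.tap(find.text('Hello'));
    await tester.pump();

    expect(tapCount, 1);
  });
}
```

Note how little of a real app this exercises: no navigation, no data loading, no state management.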

Unleash the full power of widget tests

Have you ever maintained a UI test suite with 1000+ tests running on a device? Well, in my last job as a native Android developer, we had 1200 UI tests (Espresso). They were a life saver, absolutely, but they took 16 minutes on 28 shards on one virtual device on Firebase Test Lab, at a cost of $5. Not only did that slow down CI and cost money, it also meant we could never run the whole suite on one of our own devices. If you work on an isolated feature, that’s fine, but what about a large refactoring job that potentially affects a lot of features? Painful. Very painful.

With Flutter, we are extremely lucky that we can write UI tests that do not need a device to run. Widget tests can cover 90% of the tests you’d normally run on a device for native code. They run in a fraction of the time and without complicated and costly CI integration with a device test lab.

How? Just launch the main app widget. Simple as that.

What, launching the whole app each time? Yep.

Are you sure? Yep.

Remember, those tests run fast. It doesn’t matter that you need 5 taps to navigate to the screen you actually want to test. It will happen super fast.

What do you mock in those tests then? Only what lives outside Flutter. That’s the JSON sent by the server, and anything else sent to your Flutter code over a native channel.
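As a rough sketch of what this can look like, the test below launches a whole (tiny, made-up) app, navigates with a tap, and replaces only the server. It assumes the app talks to the server through package:http and accepts the client as a constructor parameter; MyApp, HomeScreen, OrdersScreen, the URL and the JSON shape are all hypothetical.

```dart
import 'dart:convert';

import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';
import 'package:http/http.dart' as http;
import 'package:http/testing.dart';

// Hypothetical stand-in for your real root widget. The HTTP client is
// injected so that a test can replace the server and nothing else.
class MyApp extends StatelessWidget {
  const MyApp({super.key, required this.client});
  final http.Client client;

  @override
  Widget build(BuildContext context) =>
      MaterialApp(home: HomeScreen(client: client));
}

class HomeScreen extends StatelessWidget {
  const HomeScreen({super.key, required this.client});
  final http.Client client;

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      body: TextButton(
        onPressed: () => Navigator.of(context).push(
          MaterialPageRoute<void>(
            builder: (_) => OrdersScreen(client: client),
          ),
        ),
        child: const Text('My orders'),
      ),
    );
  }
}

class OrdersScreen extends StatefulWidget {
  const OrdersScreen({super.key, required this.client});
  final http.Client client;

  @override
  State<OrdersScreen> createState() => _OrdersScreenState();
}

class _OrdersScreenState extends State<OrdersScreen> {
  late Future<List<String>> _orders;

  @override
  void initState() {
    super.initState();
    _orders = _load();
  }

  Future<List<String>> _load() async {
    final response =
        await widget.client.get(Uri.parse('https://example.com/orders'));
    final json = jsonDecode(response.body) as Map<String, dynamic>;
    return (json['orders'] as List<dynamic>).cast<String>();
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      body: FutureBuilder<List<String>>(
        future: _orders,
        builder: (context, snapshot) {
          final orders = snapshot.data;
          if (orders == null) return const Text('Loading');
          return Column(children: [for (final o in orders) Text(o)]);
        },
      ),
    );
  }
}

void main() {
  testWidgets('shows the orders returned by the server', (tester) async {
    // The only mock: the JSON the server would send.
    final client = MockClient((request) async {
      return http.Response('{"orders": ["Blue teapot"]}', 200);
    });

    // Launch the whole app, then navigate like a user would.
    await tester.pumpWidget(MyApp(client: client));
    await tester.tap(find.text('My orders'));
    await tester.pumpAndSettle();

    expect(find.text('Blue teapot'), findsOneWidget);
  });
}
```

The navigation, the network call, the JSON parsing and the rendering are all exercised by one short test.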

Before diving further, let’s agree on some terminology for the rest of this article. I will call this kind of widget test a “full widget tree” test. There may be a better name for it, so feel free to suggest one. Until then, this is what I call it 🙂

Why bother with “full widget tree” tests? Why aren’t “unit” widget tests enough?

“Full widget tree” tests enable you to:

  • refactor your code safely. They test your app functionality as experienced by the user, so they won’t break if you merge two widgets, change their parameters, or refactor how you handle state management.
  • test your navigation. A lot of bugs happen between widgets.
  • document how your widget should work. No need to track down 5 layers to find out if error code INVALID_GIFT_VOUCHER is handled, you can check your widget tests and see if you have a test for it.
  • increase your test coverage super quickly, as they exercise a lot of code.

How to gain full power from “full widget tree” tests?

A few tips:

  • test for what the users see, not implementation details. So, generally, aim to find widgets by a string or icon, not a key or a widget type. There are exceptions, but most of your tests should test for strings and icons.
  • spend time setting up helper methods for writing widget tests. E.g. no one wants to write a tap followed by a pump every time. It’s not concise, so create a helper method for that.
  • spend time setting up a way to easily mock a specific API call with a specific JSON file. Also set up a way to mock different status codes from the server, including error codes your server may send, as well as I/O exceptions.
  • spend time setting up helper methods for common initial scenarios that rely on some data saved on device, such as a logged in user.
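The helpers described in the tips above could be sketched roughly as follows. This is one possible shape, not a prescribed one: it assumes the app uses package:http and shared_preferences, that JSON fixtures live under test/fixtures/, and that the session is stored under an 'auth_token' key. All helper names, the folder layout and the key are hypothetical.

```dart
import 'dart:io';

import 'package:flutter_test/flutter_test.dart';
import 'package:http/http.dart' as http;
import 'package:http/testing.dart';
import 'package:shared_preferences/shared_preferences.dart';

extension TesterHelpers on WidgetTester {
  /// Taps the widget showing [text], then pumps until animations settle.
  Future<void> tapText(String text) async {
    await tap(find.text(text));
    await pumpAndSettle();
  }
}

/// A canned server reply: either a JSON fixture with a status code,
/// or a simulated I/O failure.
class StubResponse {
  const StubResponse(this.fixture, {this.status = 200}) : ioError = false;
  const StubResponse.ioError()
      : fixture = null,
        status = 0,
        ioError = true;

  /// File name under test/fixtures/ (assumed layout).
  final String? fixture;
  final int status;
  final bool ioError;
}

/// Builds a client that answers each URL path with its stubbed response.
http.Client stubServer(Map<String, StubResponse> routes) {
  return MockClient((request) async {
    final stub = routes[request.url.path];
    if (stub == null) return http.Response('Not stubbed', 404);
    if (stub.ioError) throw SocketException('Simulated network failure');

    final body = File('test/fixtures/${stub.fixture}').readAsStringSync();
    return http.Response(
      body,
      stub.status,
      // Declare UTF-8 so fixtures with non-Latin-1 characters are accepted.
      headers: {'content-type': 'application/json; charset=utf-8'},
    );
  });
}

/// Seeds the on-device storage so the app starts with a logged-in user.
void givenLoggedInUser() {
  SharedPreferences.setMockInitialValues({'auth_token': 'test-token'});
}
```

A test could then start with something like `givenLoggedInUser()`, launch the app with `stubServer({'/orders': StubResponse('orders.json')})`, and navigate with `await tester.tapText('My orders')`. Swapping in `StubResponse('error.json', status: 403)` or `StubResponse.ioError()` covers the error scenarios.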

When to use a “unit” widget test?

I’m not against “unit” widget tests. Really, I’m not.

They are great for:

  • your own library of UI widgets you use throughout your app, or that other people may use in their app.
  • in-app form data validation. Though you can do that in a full widget tree test too, there is no harm in launching only the form widget for those tests.
  • a widget you specifically want to test on a different screen size, when you don’t want to test the rest of the app on that screen size (changing the screen size will affect the need for scrolling etc).
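For the screen size case, one way to do it is to resize the test surface for a single test. The sketch below uses setSurfaceSize on the test binding; the TermsList widget and the chosen dimensions are made up for illustration.

```dart
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';

// Hypothetical widget: 30 fixed-height rows inside a scrolling list.
class TermsList extends StatelessWidget {
  const TermsList({super.key});

  @override
  Widget build(BuildContext context) {
    return ListView(
      children: [
        for (var i = 1; i <= 30; i++)
          SizedBox(height: 50, child: Text('Clause $i')),
      ],
    );
  }
}

void main() {
  testWidgets('last clause is reachable on a small screen', (tester) async {
    // Shrink the test surface to a small phone, and restore the default
    // size once the test is over so other tests are not affected.
    await tester.binding.setSurfaceSize(const Size(320, 480));
    addTearDown(() => tester.binding.setSurfaceSize(null));

    await tester.pumpWidget(
      const MaterialApp(home: Scaffold(body: TermsList())),
    );

    // 30 rows of 50 logical pixels do not fit in 480, so the last row
    // has not been built yet.
    expect(find.text('Clause 30'), findsNothing);

    await tester.scrollUntilVisible(find.text('Clause 30'), 200);
    expect(find.text('Clause 30'), findsOneWidget);
  });
}
```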

Should you test everything?

Ultimately, maybe. But not yet.

Things to prioritise:

  • error scenarios
  • core paths in your app
  • core information on the core screens of your app, including being able to scroll to it if information is further down
  • strings should be in the team’s main language (for test readability)
  • pick a common screen size
  • if your app uses timezones, please test those. A lot. They are a major source of bugs (I have worked on 3 such apps, and timezones are hard to reason about, no matter the programming language).

Then, you can add:

  • loading states
  • non core screens
  • less common data scenarios
  • strings in other languages supported by app
  • analytics. E.g. tap a button, does your app call your analytics logger?
  • error reporting. E.g. throw some gibberish JSON at a screen, does your app call your error reporting logger?
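The analytics item can be checked with a small recording fake, as in the sketch below. It assumes your app hides its analytics SDK behind an interface that can be injected; AnalyticsLogger, RecordingAnalyticsLogger, MyApp and the event name are all hypothetical. The same pattern would apply to an error reporting logger.

```dart
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';

// Hypothetical interface; a real app would wrap its analytics SDK here.
abstract class AnalyticsLogger {
  void logEvent(String name);
}

// Test double that simply remembers every event it receives.
class RecordingAnalyticsLogger implements AnalyticsLogger {
  final List<String> events = [];

  @override
  void logEvent(String name) => events.add(name);
}

// Hypothetical stand-in for the real root widget.
class MyApp extends StatelessWidget {
  const MyApp({super.key, required this.analytics});
  final AnalyticsLogger analytics;

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: Scaffold(
        body: Center(
          child: ElevatedButton(
            onPressed: () => analytics.logEvent('checkout_tapped'),
            child: const Text('Checkout'),
          ),
        ),
      ),
    );
  }
}

void main() {
  testWidgets('tapping Checkout logs an analytics event', (tester) async {
    final analytics = RecordingAnalyticsLogger();

    await tester.pumpWidget(MyApp(analytics: analytics));
    await tester.tap(find.text('Checkout'));
    await tester.pump();

    expect(analytics.events, ['checkout_tapped']);
  });
}
```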

When to write “full widget tree” tests?

As soon as you can. From the start if you’re lucky enough to work on a new app, or as soon as you join a project if it’s an existing codebase.

Whatever you do, do not refactor a legacy codebase without such tests. Please, just don’t. Most unit tests and “unit” widget tests will themselves need to be rewritten during a non-trivial refactoring, which makes them useless as a safety net.

But you know what, it’s not so much work. OK, you need to set up a few things at first, but having such tests means you can bypass writing some unit tests.

If you use Blocs and you’re not developing a library, the only Blocs you need unit tests for are those used by many widgets, such as connectivity status or user login changes.

You can also bypass unit tests for repositories, and for your JSON parsing, as all those layers will be exercised in your “full widget tree” tests.

There is no need to test that your data source code throws InvalidGiftVoucherException when you have a “full widget tree” test that can simulate a server response with status code 403 and error code INVALID_GIFT_VOUCHER. After all, the user doesn’t care that your code throws InvalidGiftVoucherException. What matters is that the app displays “Sorry, we do not recognise this gift voucher code”.
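A sketch of such a test is shown below. The tiny MyApp, the endpoint URL and the {"error": ...} JSON shape are assumptions made for illustration, and the app is assumed to use an injectable package:http client. The point is that the test only knows about the server reply and the text on screen.

```dart
import 'dart:convert';

import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';
import 'package:http/http.dart' as http;
import 'package:http/testing.dart';

// Hypothetical stand-in for the real app: one voucher form.
class MyApp extends StatefulWidget {
  const MyApp({super.key, required this.client});
  final http.Client client;

  @override
  State<MyApp> createState() => _MyAppState();
}

class _MyAppState extends State<MyApp> {
  final _code = TextEditingController();
  String _message = '';

  Future<void> _redeem() async {
    final response = await widget.client.post(
      Uri.parse('https://example.com/vouchers/redeem'),
      body: {'code': _code.text},
    );
    final json = jsonDecode(response.body) as Map<String, dynamic>;
    setState(() {
      _message = json['error'] == 'INVALID_GIFT_VOUCHER'
          ? 'Sorry, we do not recognise this gift voucher code'
          : 'Voucher applied';
    });
  }

  @override
  void dispose() {
    _code.dispose();
    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: Scaffold(
        body: Column(
          children: [
            TextField(controller: _code),
            TextButton(onPressed: _redeem, child: const Text('Apply')),
            Text(_message),
          ],
        ),
      ),
    );
  }
}

void main() {
  testWidgets('shows a friendly message for an unknown voucher',
      (tester) async {
    // Simulate the server rejecting the voucher.
    final client = MockClient((request) async {
      return http.Response('{"error": "INVALID_GIFT_VOUCHER"}', 403);
    });

    await tester.pumpWidget(MyApp(client: client));
    await tester.enterText(find.byType(TextField), 'NOT-A-CODE');
    await tester.tap(find.text('Apply'));
    await tester.pumpAndSettle();

    expect(
      find.text('Sorry, we do not recognise this gift voucher code'),
      findsOneWidget,
    );
  });
}
```

Whether the app maps the 403 to an exception, a result type or a Bloc state is invisible to this test, so it survives a refactoring of those layers.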

This is why those tests increase your test code coverage quickly: you basically exercise all the layers.

I know, I know, it runs counter to most of the advice we hear as developers. We learn to layer our code and separate concerns, and we often assume that if we unit test everything, then everything will work fine. Except, it does not. Most bugs happen when two things that go together don’t work together properly.

What to use integration tests for?

For full integration. With the device AND with the server. E.g. don’t mock the JSON, actually make the server calls to your staging server for your core screens.

Conclusion

When in doubt, let the principle of “user experience” guide you.

Most apps are UI code. The rest of the code is there to put the right thing on the UI so the user can see it, and to process the actions a user takes through the UI. An app launches from an entry screen on a device. A Flutter app maintains a widget tree, and those can get quite complex. Therefore, for most apps, we need tests that mimic the real conditions as closely as possible, while also keeping in mind the other goals of clean code, such as easy to maintain code and fast feedback loops (hence widget tests rather than on device tests). This is why I think we all should write more “full widget tree” tests 🙂

Author: Natalie Masse Hooper

Mobile app developer with 14 years experience. Started with the Android SDK in 2010 and switched to Flutter in 2017. Past apps range from start ups to large tech (Google) and non tech (British Airways) companies, in various sectors (transport, commercial operations, e-commerce).

5 thoughts on “Unleash the full power of Flutter widget tests”

  1. This is really a great article. I heard similar advice in this talk https://www.youtube.com/watch?v=2WG32k9_c48 where he mentioned having a 60-40% ratio of tests, where 60% are end to end and 40% unit or integration tests.

    I personally follow this approach when I am working on a codebase where I don’t have tests at all. Because writing just one single end to end covers a lot of code and helps me to move faster.

    However, if I am working on a codebase with a good amount of tests, where I don’t have to worry much about breaking the existing behavior (using the existing tests as a safety net), I prefer unit tests. As the name “unit” suggests, they help me to test small classes and functions.

    For example, if I have a screen where I have to search contacts with a given filter, I would rather have a unit test for searching contacts in the list with the given filters than load the whole contact search widget. This kind of test covers basic string validation, regex validation, loop filters, transformation, and parsing (which I think is where most of the bugs are found).

    1. Thanks for sharing your experience.

      The video you’ve shared is about Android tests, where end to end tests are integration tests, which have to run on device. This is costly, both in time and money (device farm on CI). But, with Flutter, we have widget tests, which run super fast. Therefore, IMHO, there is no downside to using “full widget tree” tests as I described, because the usual downsides of UI tests don’t apply.

      Yes, you can do a unit widget test for form validation, but most bugs are in how widgets interact with each other (or rather, don’t interact as they should). You lose out on those bugs if you stick to unit widget tests. So, even if an app is full of such tests, I do not have confidence. Likewise for tests that check that a widget of a given type or with a given key is shown: the widget may be shown, but are the actual expected strings and icons shown on the screen? I have seen many apps where developers believed too much in the power of such tests and naively thought their tests covered everything, and were very surprised by the crash reports they were getting. I want the tests to be as close to what the user experiences as possible, and the user doesn’t launch your LoginPage widget on its own, the user navigates to it from launching the app.

      Test code is code, it should be looked after, it’s a time investment, it’s a lot of effort because it needs to be kept clean, high standards need to be applied if you want maintainable tests, and many developers sometimes feel they can’t afford to invest that time. However, it’s a false economy if you’re developing a stable app, especially because it is actually faster to write full widget tree tests once you’ve set things up for them, than unit widget tests. Of course, for prototypes or start ups that are just testing the market, you don’t need such tests, you should focus on launching and gathering user feedback to validate your app before worrying about tests.

  2. Very informative! I am still new to Flutter, but this is extremely helpful. Sounds like it’s coming from a very experienced programmer, a very well-written guide!!
