Fetchmail and Google’s OAuth 2.0 enforcement

This post was written by eli on June 11, 2022
Posted Under: email,Internet,Server admin

Introduction

After a long period during which Google’s smtp server occasionally refused to play ball with fetchmail, along with tons of Critical Alerts on “someone knowing my password”, requests to move away from “Less Secure Apps” (LSA) and other passive-aggressive behaviors, I eventually got the famous “On May 30, you may lose access” mail. That was in the beginning of March.

My reaction to that was exactly the same as to previous warnings: I ignored it. Given the change that was needed to conform to Google’s requirements, I went on with the strategy that “may” doesn’t necessarily mean “will”, and that I don’t move until they pull the plug. Which they didn’t on May 30th, as suggested. Instead, it happened on June 8th.

Since I don’t intend to ditch my Gmail address, it follows that OAuth2 is going to be part of my life from now on, and that I’ll probably have to figure out why this or that doesn’t work out every now and then. So I might as well learn the darn thing once and for all. Which I just did.

This post consists of my own understanding of this matter. More than anything, it’s intended as a reminder to myself each time I need to shove my hands deep into some related problem.

None of the covered subjects have anything to do with my professional activity as an engineer.

Spoiler: My decisions on my own mail processing

This is my practical conclusion of everything written below, so the TL;DR is that I decided to stop using Fetchmail with Google’s mail servers. Instead, I’ve set Gmail to forward all emails to one of my other email accounts, to which I have pop3 access with Fetchmail (it’s actually on a server I control). I just added another transmission leg.

This is mainly because of the possibility that continuing to use Fetchmail with Google’s server will require my personal attention every now and then, for reasons I elaborate on below. It’s not impossible, but I’m not sure Fetchmail with Gmail is going to be a background thing anymore.

Mail forwarding is a solid solution to this. It doesn’t create a single point of failure, because I can always access the mails with Gmail’s web interface, should there be a problem with the forward-to-Fetchmail route. The only nasty thing that can happen is that the forwarding’s destination email address may be disclosed to the sender, if the delivery fails for some reason: It appears in the bounce message.

So if you want to use the forwarding method for an email address that you keep for the sake of anonymity, you’ll have to use a destination email address that says nothing about you either. There are plenty of email services with POP3 support at a fairly low cost.

The only reason I still need to live with OAuth2 support is that emails that I send with my Gmail address must go through Google’s servers, or else they are rejected by a whole lot of mail servers out there by virtue of DMARC.

So I upgraded Thunderbird to a version that supports OAuth2, and it works nicely with Google. I could have fetched the emails with Thunderbird too, but I still want to run my own spam filter, which is why I want fetchmail to remain in the loop for arriving mails.

And now, to the long story.

What is OAuth2?

To make a long story short, it’s the mechanism behind “Login with Google / Facebook / whatever”. Rather than having the user maintain a username and password for every service they access, there’s one Authorization Server, say Google, that maintains the capability to verify the actual user.

The idea is that when the user wants to use some website with “Login with Google”, the website doesn’t need to check the user’s identity itself, but instead it relies on the authentication made by Google. As a bonus, the fact that the user has logged into the site with Google, allows the site’s back-end (that is, the web server) to access some Google services on behalf of the user. For example, to add an entry in the user’s Google calendar.

To make this work, the site’s back-end needs to be able to prove that it’s eligible to act on behalf of the said user. For this purpose, it obtains an access token from the Authorization Server. In essence, this access token is a short-lived password for performing certain tasks on behalf of a certain user.

So an access token is limited in three ways:

  • It’s related to a specific Google user
  • It’s limited in time
  • It gives its owner only specific permissions to carry out operations, or as they’re called, scopes.

For the sake of fetching emails, the recent change was that Gmail moved from accepting username + password authentication to only accepting an access token that allows the relevant user to perform pop / imap operations.
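To make the "access token instead of a password" idea concrete, here is a minimal sketch of how a client could present the token to an IMAP server using the XOAUTH2 SASL mechanism, which Google documents for Gmail. The email address and token below are placeholders, and the function names are mine, not part of any library.

```python
import base64


def xoauth2_string(user: str, access_token: str) -> str:
    """Return the raw (not yet base64-encoded) XOAUTH2 client response."""
    # \x01 is the Ctrl-A separator used by the XOAUTH2 format.
    return f"user={user}\x01auth=Bearer {access_token}\x01\x01"


def xoauth2_base64(user: str, access_token: str) -> str:
    """Return the base64 form that goes on the wire after AUTHENTICATE XOAUTH2."""
    raw = xoauth2_string(user, access_token).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    print(xoauth2_base64("someone@gmail.com", "placeholder-access-token"))
```

With Python's imaplib, the raw string would be handed over with something like imap.authenticate("XOAUTH2", lambda challenge: xoauth2_string(user, token).encode()), since imaplib does the base64 encoding itself. Note that the scope still matters: a token that wasn't granted mail access will be rejected even if it's perfectly valid otherwise.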

Interactions with the Authorization Server

The Authorization Server is responsible mainly for two tasks:

  • The initial authentication, which results in obtaining an access token, a refresh token and various information (in JSON format).
  • The refreshing of the access token, which is performed to replace an expired or soon-to-expire access token with a valid one.

So the overall picture is that it starts with some initial authentication, and then the owner of the access token keeps extending its validity by recurring refresh requests.

The initial authentication is done by a human using a web browser. That’s the whole point. This allows the Authorization Server to control the level of torture necessary to obtain the access token. It may not require any action if the user is already safely logged in, and it may suddenly decide to ask silly questions and/or perform two-factor authentication and whatnot.

Refreshing the access token is a computer-to-computer protocol that requires no human interaction. In principle, access can be granted forever based upon that initial authentication by refreshing the token indefinitely. But the Authorization Server is nevertheless allowed to refuse a refresh request for any or no reason. In fact, this is the way Google can force us humans to pay attention. The documentation tends to imply that tokens are always refreshed, but at the same time clearly states that the requester of a refresh should handle a refusal gracefully by reverting to the browser thing.
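As a rough illustration of the refresh step and of handling a refusal gracefully, the sketch below builds a refresh request as described in RFC 6749 and sorts the server's answer into three cases. The endpoint is the one Google lists in its OAuth 2.0 documentation; the function names and the three labels are my own invention. Treating only "invalid_grant" as the signal to go back to the browser is a simplifying assumption.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Google's token endpoint, as listed in its OAuth 2.0 documentation.
TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def build_refresh_body(refresh_token, client_id, client_secret):
    """Form-encode a refresh request (RFC 6749, section 6)."""
    return urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")


def classify_token_response(status, body_text):
    """Return ("ok" | "reauthenticate" | "error", parsed JSON as a dict)."""
    try:
        data = json.loads(body_text)
    except ValueError:
        return "error", {}
    if not isinstance(data, dict):
        return "error", {}
    if status == 200 and "access_token" in data:
        return "ok", data
    # Assumption: a refused refresh token shows up as "invalid_grant",
    # which is the error code RFC 6749 assigns to that situation.
    if data.get("error") == "invalid_grant":
        return "reauthenticate", data
    return "error", data


def refresh_access_token(refresh_token, client_id, client_secret):
    """POST the refresh request. Network failures (URLError) are not handled."""
    body = build_refresh_body(refresh_token, client_id, client_secret)
    request = urllib.request.Request(TOKEN_ENDPOINT, data=body)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            text = response.read().decode("utf-8")
            return classify_token_response(response.status, text)
    except urllib.error.HTTPError as err:
        # Refusals arrive as HTTP errors with a JSON body explaining why.
        return classify_token_response(err.code, err.read().decode("utf-8", "replace"))
```

A caller that gets "reauthenticate" has nothing left to try automatically: the only way forward is a new browser session.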

Remember those “suspicious activity” notifications from Google, begging us to confirm that it was us doing something on an unknown device? No need to beg anymore. If Google wants us to confirm something, it just denies the token refresh request. The only way to resume access is going back to initial authentication. This brings the human user to a browser soon enough to re-authenticate, which is a good opportunity to sort out whatever needs sorting out.

For example, if Thunderbird is used to access mail from Gmail with OAuth2, it must have the capability to open a browser window in order to perform the initial authentication (which it does nowadays). Hence if a refresh request fails, this browser window will be opened again for further action. So there’s a means to talk with the human user. This possibility didn’t exist with the old password authentication, because if that failed, the user was prompted for a new password. So there was no reasonable way to initiate communication with the human user by refusing access.

How obnoxious service providers intend to be with this new whip is yet to be seen, but it’s clear that OAuth2 opens that possibility. The fact that access tokens are currently refreshed forever without the need to re-authenticate, doesn’t say how it’s going to be in the future.

As a bit of a side note, it’s common practice that access to cloud services can be made with an initial authentication that doesn’t involve a web browser. This makes sense, as software that consumes these services typically runs on servers with no human around. Today, this can be used to obtain tokens for Gmail access, but I doubt that will go on for long.

The authentication handshake in a nutshell

There are plenty of resources on OAuth2: To begin with, there’s RFC 6749, which defines OAuth2, and several tutorials on the matter, for example this one. And there’s Google’s page on using OAuth2 for accessing Google APIs, which is maybe the most interesting one, as it walks through the different usage scenarios, including devices that can’t run a web browser.

This way or another, it boils down to the following stages for a website with “Login with X”:

  • A web browser goes to the Authorization Server with a URL that includes information about the request, by virtue of a link saying “Login with Google” or something like that. It’s typically a very long and tangled URL with several CGI-style parameters (it’s a GET request).  Among the parameters in the link, there’s the client ID (who is requesting access), what kind of access is required from Google’s servers (the scopes) and to what URL the Authorization Server should redirect the browser when it’s done torturing the human in front of the browser. For example, the link used by TikTok’s “Continue with Google” goes
    https://accounts.google.com/o/oauth2/v2/auth/identifier?client_id=1096011445005-sdea0nf5jvj14eia93icpttv27cidkvk.apps.googleusercontent.com&response_type=token&redirect_uri=https%3A%2F%2Fwww.tiktok.com%2Flogin%2F&state=%7B%22client_id%22%3A%221096011445005-sdea0nf5jvj14eia93icpttv27cidkvk.apps.googleusercontent.com%22%2C%22network%22%3A%22google%22%2C%22display%22%3A%22popup%22%2C%22callback%22%3A%22_hellojs_5kkckpps%22%2C%22state%22%3A%22%22%2C%22redirect_uri%22%3A%22https%3A%2F%2Fwww.tiktok.com%2Flogin%2F%22%2C%22scope%22%3A%22basic%22%7D&scope=openid%20profile&prompt=consent&flowName=GeneralOAuthFlow
  • The Authorization Server does whatever it does in that browser window, and when that ends, it redirects the browser with a 302 HTTP redirect to the URL that appeared in the request. It appends a CGI-style “code=” parameter to the URL, and by doing that it gives the back-end server an authorization code. If there was a “state” parameter in the link to the Authorization Server, it’s copied as a second parameter in this redirection. This is how the back-end server knows which request it got a response for.
  • Now that the back-end server has the authorization code, it contacts the Authorization Server directly over HTTP, and requests access tokens, using this code in the request. The Authorization Server responds with a JSON string, that contains the access token, the refresh token and other information.
  • Using the access token, the back-end server can access various Google API servers.
  • Using the refresh token, the back-end server can obtain a new access token (and possibly a new refresh token) when the existing access token is about to expire. Refresh tokens have no given expiration time, but if a new one is obtained during refresh, it should be used in following refresh requests.
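The first three stages above can be sketched in a few lines. This is only an illustration of the authorization code flow as RFC 6749 defines it, not a complete client: the client ID, redirect URI and function names are placeholders of mine. The "access_type" parameter is a Google-specific addition for requesting a refresh token, and https://mail.google.com/ is the scope Google documents for IMAP, POP and SMTP access.

```python
import secrets
import urllib.parse

# Google's authorization endpoint (the same one as in the TikTok link above).
AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"


def build_authorization_url(client_id, redirect_uri, scope, state):
    """Stage 1: the long URL that the browser is sent to."""
    query = urllib.parse.urlencode({
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",   # ask for an authorization code
        "scope": scope,
        "state": state,            # echoed back in the redirect
        "access_type": "offline",  # Google-specific: request a refresh token
    })
    return f"{AUTH_ENDPOINT}?{query}"


def parse_redirect(redirect_url, expected_state):
    """Stage 2: pull the authorization code out of the redirect URL."""
    query = urllib.parse.parse_qs(urllib.parse.urlsplit(redirect_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: not a response to our request")
    if "code" not in query:
        raise ValueError(query.get("error", ["no authorization code"])[0])
    return query["code"][0]


def build_code_exchange_body(code, client_id, client_secret, redirect_uri):
    """Stage 3: the form body POSTed to the token endpoint."""
    return urllib.parse.urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")


if __name__ == "__main__":
    state = secrets.token_urlsafe(16)  # random, so responses can't be forged
    print(build_authorization_url("placeholder-client-id",
                                  "https://example.com/callback",
                                  "https://mail.google.com/", state))
```

Checking the state value before trusting the code is what ties the response to the request that triggered it.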

It may be required to add additional credentials in requests for an access token (i.e. along with an authorization code or a refresh token), namely the client_id and client_secret parameters. These credentials are relevant in particular with cloud applications, and they are obtained when registering for such.

So this was the scenario for a website. What about fetching mails with Thunderbird and the like? It’s basically the same principle, only that the redirection with the authorization code is handled differently. There are several other variations, depending on the capabilities of the device that needs access. Among others, there’s a browser-less option for cloud applications, which is once again a variant of the above.

As for Thunderbird and other MUAs, they take the role of the back-end server: If they don’t have a valid access token, they open a browser window with the Authorization Server’s URL, with all necessary parameters. The redirection to the website is done differently, but it boils down to Thunderbird obtaining the authorization code and subsequently using it to obtain the access token. And then refreshing it as necessary.

So to summarize: There’s a browser session that ends with an authorization code, and the application uses this authorization code to get an access token. This access token is effectively a short-lived password that is used with Google’s API servers, Google’s smtp server included.

And by the way, there’s a maintained Perl module for OAuth2. I don’t know if I should be surprised about that.

fetchmail and OAuth2

Fetchmail 7 is apparently going to support OAuth2, but there’s little enthusiasm for supporting it in the long run. It also appears that OAuth2 will not be backported to fetchmail-6.x.x.

To Fetchmail, the authentication tokens are just a replacement for the password. It’s another secret to send away to the server. So the entry in .fetchmailrc goes something like this:

poll <imap_server> protocol imap
  auth oauthbearer username <your_email>
  passwordfile "/home/yourname/.fetchmail-token"
  is yourname here
[ ... ]

For this to work, there must be a mechanism for keeping the token valid. The mechanism suggested in fetchmail’s own git repository is that a cronjob first invokes a Python script that refreshes the token if necessary (and updates .fetchmail-token). Fetchmail is then called (in non-daemon mode) as part of this cronjob, and does its thing.
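This is not the script from fetchmail's repository, just a minimal sketch of the "update .fetchmail-token" step that such a cronjob needs. It writes the token to a temporary file and renames it into place, so that fetchmail never sees a half-written file. The function name is mine, and the trailing newline is an assumption about what fetchmail accepts in a password file.

```python
import os
import tempfile


def write_token_file(path, access_token):
    """Atomically replace the token file with a fresh access token."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp creates the file readable and writable by its owner only,
    # which is what a file holding a secret should look like.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".token-")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(access_token + "\n")
        # A rename within the same directory is atomic on POSIX systems.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    # Placeholder token. A real cronjob would obtain it from a refresh request.
    write_token_file(os.path.expanduser("~/.fetchmail-token"),
                     "placeholder-access-token")
```

The cronjob would then run fetchmail right after this, in non-daemon mode, so that the token is always fresh when it's used.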

The approach for making this work automatically is to rely on the API for Google and Microsoft’s cloud services, which is intended for allowing scripts to access these services in a safely authenticated way. It seems to be an attempt to avoid the browser session at all costs. Which is understandable, given that fetchmail is traditionally a daemon that works silently in the background.

However, using fetchmail like this requires registering the user as a Google cloud API user, which is quite difficult and otherwise annoying. So I can definitely understand the lack of enthusiasm expressed by Fetchmail’s authors (more on that below).

But I beg to differ on this approach. The browser session is what Google really wants, so there’s no choice but to embrace it. Since my own motivation to use fetchmail is zero at this point, I didn’t implement anything, but this is what I would have done. And maybe will do, if it becomes relevant in the future:

A simple systemd-based daemon keeps track of when tokens expire, and issues refresh requests as necessary. If a valid token for a Gmail account is missing (because the refresh requests failed, or because an account was just added), this daemon draws the user’s attention to the need for an authentication session. Maybe a popup, maybe an icon in the system tray. When the user responds to that alert, a browser window opens with the relevant URL, and the authentication process takes place, ending with an authorization code, which is then turned into a valid token.
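Since I didn't implement this, the following is only a sketch of the decision logic such a daemon would revolve around, with the network and the user interface left out. All names here are hypothetical, and the five-minute safety margin is an arbitrary choice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenState:
    access_token: Optional[str]
    refresh_token: Optional[str]
    expires_at: float  # seconds since the epoch


def next_action(state: TokenState, now: float, margin: float = 300.0) -> str:
    """Decide what the daemon should do for one account."""
    if state.refresh_token is None:
        return "authenticate"  # alert the user, then open the browser
    if state.access_token is None or now >= state.expires_at - margin:
        return "refresh"       # computer-to-computer, no human needed
    return "wait"


def apply_refresh_result(state: TokenState, result: Optional[dict],
                         now: float) -> TokenState:
    """Update the state after a refresh attempt. None means it was refused."""
    if result is None:
        # Refused: forget everything, which forces a browser session.
        return TokenState(None, None, 0.0)
    return TokenState(
        access_token=result["access_token"],
        # Keep the old refresh token unless the server issued a new one.
        refresh_token=result.get("refresh_token", state.refresh_token),
        expires_at=now + float(result["expires_in"]),
    )
```

The daemon's main loop would call next_action() periodically for each account, and write the current access token to wherever Fetchmail reads it from.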

As for Fetchmail itself, it keeps running as usual as a daemon, only using access tokens instead of passwords. If a token is invalid, Google’s server will reject it, and if that goes on for too long, Fetchmail issues the warning mail message we’re probably all familiar with. Nothing new.

This doesn’t require registering with any service. All it takes is entering the username and password the first time the daemon is launched, and then possibly going through whatever torture Google requires when it gets paranoid. But this is the way Google probably wants it to work, so no point trying to fight it. Frankly, I don’t quite understand why the Fetchmail guys didn’t go this way to begin with.

Future of OAuth2 support

Personally, I think Fetchmail should support OAuth2 authentication to the extent that it’s capable of using an access token for authentication. As for obtaining and maintaining the tokens, I can’t see why that has anything to do with Fetchmail.

The authors’ view is currently somewhat pessimistic. To cite the relevant entry in the NEWS file:

OAuth2 access so far seems only to be supported by providers who want to exert control over what clients users can use to access their very own personal data, or make money out of having clients verified. There does not appear to be a standard way how service end-points are configured, so fetchmail would have to carry lots of provider-specific information, which the author cannot provide for lack of resources.

OAuth2 is therefore generally considered as experimental, and unsupported, OAuth2 may be removed at any time without prior warning.

As for their affection for OAuth2, see the preface in the README.OAUTH2 file. This file nevertheless explains how to obtain an OAuth2 client id and client secret from Google and Microsoft. Something I suggested to skip, but anyhow.

App passwords

This isn’t really related, but it’s often mentioned as a substitute for OAuth2, so here are a few words on that.

It seems like there’s a possibility to generate a 16-digit password, which is specific to an app. So at least in theory, this app password could be given to Fetchmail in order to perform a regular login.

I didn’t pursue this direction, mainly because the generation of an app password requires two-step verification. Forwarding sounds so much nicer all of a sudden.

Besides, I will not be surprised if Google drops App passwords sooner or later, in particular for Gmail access.

Summary

I can’t say that I’m happy with OAuth2 becoming mandatory, but I guess it’s here to stay. My personal speculation is that it has become mandatory to allow Google to re-authenticate humans gracefully, possibly with increasingly annoying means. This is a fight against spammers, scammers and account hijackers, so paranoia is the name of the game.

Apparently, forcing the owner of the Google account into an authentication session, either with a browser on the desktop or on the mobile phone, possibly both combined, is the future weapon in this fight. It’s quite annoying indeed, but I guess there are worse problems on this planet.

Reader Comments

It is oftentimes a good idea to wait a few weeks after the uproar and *then* read some concise summary of facts and conclusions.

Thank you Eli, I bookmark this page.

Although I am myself not affected by OAuth2, I do use fetchmail. Your text is very clear as well as a good starting point, should I need to dig deeper into some specific aspects. “There are worse problems on this planet”, but I have underestimated the scope of the problem …

Cheers from France.

Michael

#1 
Written By Michael Uplawski on June 14th, 2022 @ 10:22
