Earlier this week I presented a session at the Atlanta SharePoint User Group on Building Client Side Web Parts with the SharePoint Framework. I had my session planned well in advance, but wanted to build the demos while they were fresh in my head. Last week I was out of work mode with my family relaxing on a beach in Mexico for Spring Break, so I planned to build my demos yesterday morning, on Sunday. While we were supposed to return home Saturday night, we missed our international connection and got stuck overnight, not making it home until about 12pm EDT Sunday.
No problem, I have all of Sunday afternoon. I sat down at my desk around 2pm to get in the work mindset and see what changed while I was away. Thankfully no big releases or news in the SharePoint Framework (SPFx) development world or SharePoint Online issues. By 3:30pm EDT I started to build my demo and immediately hit a brick wall.
After creating the project scaffolding with Yeoman & getting all the Node packages, the build process failed with 25+ errors. Immediately I thought “I bet we’ve hit lodashgate #394 again where something upstream changed underfoot”. Turns out that’s what it was, but it took a lot longer to find than I expected. I spent about an hour trying it on two different machines (one macOS, one Windows 10) with different versions of Node.js, and kept hitting the same thing. I also exercised my Google-fu: the error was consistent and not mentioned anywhere. I reached out to a few friends who happened to be online and confirmed I wasn’t the only one (thanks Rob Windsor!).
TL;DR
So what was it? An npm package that the SharePoint Framework build toolchain depends on for transpiling TypeScript to JavaScript (@microsoft/gulp-core-build-typescript) was updated (to version 2.4.0) and caused SPFx builds to break. The irony this time around was that the change was introduced sometime after 3:00pm EDT. Go figure… this time Microsoft broke their own build toolchain and it happened right before I started working on my demo. Argh!
The resolution was simple: after installing all npm packages you had to install the previous working version of the package (version 2.3.0). Once you did that, everything was resolved.
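As a rough sketch, the workaround for this particular incident comes down to a single npm command, run from the project folder after the regular npm install. The package name and version are the ones named above; the pin_toolchain wrapper is a hypothetical convenience, not something that ships with SPFx.

```shell
# Package and last known good version, as named in this post.
PKG="@microsoft/gulp-core-build-typescript"
GOOD_VERSION="2.3.0"

# Hypothetical helper: reinstall the toolchain package at the known good version.
# Run it from the root of the SPFx project, after the regular `npm install`.
pin_toolchain() {
  npm install "${PKG}@${GOOD_VERSION}"
}
```

Calling pin_toolchain and then running gulp build again should tell you whether that one package was the only thing that changed underfoot.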
I opened an issue in the repo with the workaround steps and found out this morning that Microsoft fixed the issue sometime the night before in version 2.4.1, around 7-9 hours after introducing it. So it’s no longer an issue.
Unfortunately it could have been avoided as I’ll explain at the end of this post.
But what if this happens to you? How can you identify the problem, fix it and move on to being productive?
Identifying the Problem
The first step is to identify the problem. Node.js, just like other modern platforms such as the .NET Framework, Go, Python and others, leverages package managers to pull reusable code into a project. In Node.js, we use npm to get dependencies, including the SPFx development build toolchain.
The problem can arise though if one of the dependencies you are taking changes in a way that is not compatible or causes an issue for your project. Sometimes you can look at an error and quickly see what the problem was, but this time that didn’t work. Here’s what my console looked like when I hit it… ah… red text in the terminal is not something you ever want to see:
The actual error code can be seen here in this gist.
So now I have to look at what changed to cause the error. It isn’t hard to figure out, but it can be tedious… it’s really just a process of elimination. Look at what changed since it last worked and check each change, one by one, to see which one introduced the issue.
How Does this Happen?
Unfortunately, the engineering team at Microsoft who gave us the SharePoint Framework has made three decisions (or didn’t make them, depending on how you look at it) that complicate this process and leave the framework prone to these problems in the future:
- Large List of Dependencies: Ever look at the number of packages the most basic SPFx client side web part requires? There are over 873, totaling in excess of 300 MB. However, your SPFx project only references a handful. Therefore, a significant number of them are nested dependencies that you can’t control.
- Lack of Explicit Versions: The core dependencies use wildcards, as do the majority of the dependencies. Therefore the packages you get may be different today compared to last week if an update was shipped. Consider also that not everyone respects or adheres to semver, so it’s hard to know if an updated package contains breaking changes.
- No Shrinkwrap: npm provides the ability to shrinkwrap packages. Due to the large dependency tree with SPFx projects, if Microsoft elected to shrinkwrap, it would effectively freeze the versions of all packages in an SPFx project. Once a project is created, developers could elect to keep the shrinkwrapped build toolchain or, as we do today, get the latest available.
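To check the first point against your own project, here is a minimal sketch that counts the installed packages and measures their size on disk. The count_packages and deps_size names are hypothetical, and the count only covers the top level of node_modules, so packages nested deeper in the tree are not included. It assumes a bash-style shell.

```shell
# Hypothetical helper: count the packages installed at the top level of node_modules.
# Unscoped packages live at node_modules/<name>, scoped ones at node_modules/@scope/<name>.
# Packages nested deeper (node_modules inside node_modules) are not counted.
count_packages() {
  root="${1:-node_modules}"
  ls -d "$root"/*/package.json "$root"/@*/*/package.json 2>/dev/null | wc -l | tr -d ' '
}

# Hypothetical helper: total size on disk of the installed dependencies.
deps_size() {
  du -sh "${1:-node_modules}" | cut -f1
}
```

Run both from the project root right after npm install and compare the count with the handful of entries in your own package.json.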
We have already seen this issue in the past… maybe you heard about the latest scandal, lodashgate? Same issue.
Let me be clear on one thing: this issue isn’t exclusive to the SharePoint Framework, Node.js, or even .NET, Go, Python… or any framework that uses package managers and takes dependencies. Any project that takes dependencies on something is at risk of upstream changes impacting it. You may say “yeah, but Node.js takes more dependencies than my .NET project so Node.js is worse”. I’d counter that point with “well Node.js is a newer framework than .NET and the development community has shown they prefer a slimmer framework with more addons (Node.js) than a bigger framework with more stuff included (.NET)… in fact, .NET Core takes the approach of Node.js”. This issue shows up when you see a project with a lot of dependencies. Unfortunately right out of the gate, the SPFx default project starts out with a long list of dependencies, making it more vulnerable to these issues.
Finding the Problem Package
As I said above, it’s a tedious process. You need to find the packages that changed since the last time things were working. What you need is a previously working version of an SPFx project. I have a recurring task reminding me to create a new project with the generator every two weeks, run npm install, and verify gulp serve works. If so, that’s my most recent working copy.
Next, I take this working copy, move it to a new folder, make sure the node_modules folder is not excluded from git (just delete the .gitignore file if it’s there) and create a new git repo locally by running the following commands:
# remove gitignore file that excludes lots of files...
rm .gitignore
# create a new git repo
git init
# add all files to the repo
git add -A
# commit all changes to the repo
git commit -m "works on my machine"
At this point, you’ve got a saved working snapshot you can jump back to. So the next step is to update everything to trigger the error again. This is easy, just run npm install and let npm grab all its stuff.
Once that finishes, confirm the error surfaces by running gulp serve or just gulp build.
Now, we need to see what changed… for this, I just use the git diff capabilities to see what changed underfoot. You are really only interested in the package.json files within each package.
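If you prefer the terminal to a diff viewer, one way to narrow things down is to ask git for only the package.json files under node_modules and keep just the version lines. This is a rough sketch: changed_versions is a made-up name, and it assumes each package.json keeps its "version" field on a line of its own.

```shell
# Hypothetical helper: show which installed packages changed version since the last commit.
# Run it from the root of the git repo created above.
changed_versions() {
  # -U0 drops context lines; the pathspec limits the diff to package.json files in node_modules.
  git diff -U0 -- 'node_modules/*package.json' |
    grep -E '^(\+\+\+ |[-+] *"version")'
}
```

Each changed package shows up as a `+++` file line followed by its old (`-`) and new (`+`) version. Packages that are brand new will not appear in git diff because they are untracked; git status lists those.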
Look at each package that changed. What you need to do is revert back to the previous version. So in the following case, we see the resolve package changed from version 1.1.7 to 1.3.2 (I’m using Visual Studio Code to see the changes):
Let’s revert this package:
# remove the updated package; this isn't necessary as installing a specific version will overwrite it
npm uninstall resolve
# install the previous version
npm install resolve@1.1.7
Now try gulp build. Did the error go away? No? You need to repeat these steps as you move onto each package, checking to see if it’s the issue. Before moving on, though, I like to stop and save the changes I’ve made so far. Because this is a git repo, it makes things easy:
git add -A
git commit -m "verified package ok - [package-name]"
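Since you will repeat the revert, rebuild and commit steps for every changed package, they could be wrapped in a small function. The sketch below is illustrative only: try_revert is a hypothetical name, and it assumes gulp build exits with a non-zero code when the build fails.

```shell
# Hypothetical helper: reinstall one package at its previous version, then rebuild.
# Usage: try_revert <package-name> <previous-version>
# Returns 0 if the build passes (culprit found), 1 if it still fails, 2 if npm fails.
try_revert() {
  pkg="$1"
  ver="$2"
  npm install "${pkg}@${ver}" || return 2
  # Assumption: `gulp build` exits non-zero when the build fails.
  if gulp build; then
    echo "build passes with ${pkg}@${ver}: found the problem package"
    return 0
  fi
  echo "build still fails: ${pkg} is not the culprit"
  # Save progress so the next package starts from a known state.
  git add -A && git commit -q -m "verified package ok - ${pkg}"
  return 1
}
```

For the example above that would be `try_revert resolve 1.1.7`, moving on to the next changed package whenever it reports that the build still fails.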
But what if the error did go away? That means…
You found the problem package!
That means your workaround fix is as follows:
- Create new project using the Yeoman SPFx generator
- Run npm install to get all the packages
- Reinstall the older version of the problem package using the command above
Since you found the issue, you should also log this as a bug in the SharePoint Framework repo issue list. You can follow the template I used from the incident I hit that prompted me to write this post: #501: BUG: SPFx Builds Fail after March 19 ~1900UTC | gulp-core-build-typescript update breaks builds.
But This Was All Avoidable
Microsoft has said that shrinkwrapping is on their backlog. No telling when or if it will be implemented. Since this happened before and caused havoc with lodashgate well before GA, I was hoping to see that done at GA… yet here we are again.
Until then, I hope these steps will help you if you run into this in the future. I have every reason to believe history will repeat itself, especially considering the large number of dependencies in the dependency tree the SPFx engineering team has imposed upon us with their development toolchain.
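In the meantime, nothing stops you from shrinkwrapping your own project once you have a working install. A minimal sketch, assuming you run it in a project that currently builds: npm shrinkwrap is a real npm command that writes an npm-shrinkwrap.json file recording the installed versions, while the freeze_dependencies wrapper is a hypothetical name. Whether devDependencies (where the SPFx build toolchain is referenced) are included by default depends on your npm version, and older versions may need the --dev flag, so check npm help shrinkwrap first.

```shell
# Hypothetical helper: freeze the currently installed dependency tree of a working project.
# `npm shrinkwrap` writes npm-shrinkwrap.json, which later `npm install` runs will honor.
freeze_dependencies() {
  npm install && npm shrinkwrap
}
```

Commit the generated npm-shrinkwrap.json alongside package.json so that everyone who clones the project gets the same versions you verified.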