If you’re in the business of presenting and demoing technology, you’ll know that there is such a thing as the “Demo Gods”: they hold your fate in their hands, and their sole purpose in life is to throw bugs and issues at you when you’re doing live demos on stage. I had the opportunity to speak at Devoxx Belgium this week and, while all in all the presentation went pretty well, I definitely seem to have made the Demo Gods extra angry, since I ran into an extraordinary number of issues during the live demos.
If you don’t know what Devoxx Belgium is, it is one of the most prestigious developer conferences in the world, especially for Java content, so I wanted to make sure I could deliver a top-notch presentation. The talk, named “Kubernetes. From 0 to Production-Grade with Java”, was about something I’m passionate about, which is making cloud native development easier for Java developers. I asked my colleague Alex Soto to join me and we came up with the idea to do a sort of role-playing presentation/demo combo, where I would be the Java developer and he would be the platform engineer with better knowledge about what a developer should think about when releasing their application to production in a Kubernetes environment.
Since the conference was held at a movie theater, we figured it would be appropriate and funny to create our slides with the theme “Barbenheimer”, a combination of the Barbie and Oppenheimer movies, both of which were popular in theaters this year. We decided that Barbie would be the theme of the Java developer who would work with Quarkus, an easy-breezy Java stack that makes cloud native development a lot easier and more efficient. The content about good and safe practices in production would be covered by the more serious Oppenheimer theme.
Our session started out quite well. I explained how “traditional” Java is a bit problematic for cloud native use cases due to its relatively slow startup time and large footprint, and how the fantastic Java community has been coming up with several solutions to this problem, one of which is the Quarkus framework. Quarkus offers a much faster startup time and a significantly smaller footprint. I then explained how Quarkus also has a great developer experience, and demoed how to create a new application and consume messages from a Kafka topic with just a few lines of code and no configuration, thanks to Quarkus Dev Services and their integration with Testcontainers.
Up until then everything was running exactly the way it should. I had already averted a first potential issue thanks to my colleague Daniel Oh, who reminded me that a new version of Quarkus had been released that morning and that I should make sure to cache the new Maven dependencies, lest I have to spend a minute or two downloading them live on stage. I went ahead and did just that within the last hour before the presentation. (1)
A turn for the worse
In the next part of the demo I wanted to show how to package up the application in a container and push it to a registry so I could deploy it to Kubernetes. The container build went fine, but when I tried to push the image, I got an error saying I was not authenticated to the registry, even though I had checked just 30 minutes before our talk that I was still logged in. The session must’ve expired in that small time frame. This has happened before, and I usually just retrieve the password from my password manager. That takes time, though, and I didn’t really want the 600 people in the audience, plus however many people would watch the recording online, to see where my passwords are stored. So I decided to skip the image push and move on to deploying the Kubernetes manifest to my Kubernetes (OpenShift) instance, thinking that it would use a previously pushed container image.
During my demo preparations I had always used the name “devoxx” to create my app, so a container image with that name had indeed already been pushed to my registry. For some reason, though, this time on stage I decided to name my app “my-awesome-app” instead of “devoxx”. Of course that meant there was no “my-awesome-app” container image in the registry, so the deployment to Kubernetes failed because it could not find the corresponding image. So... failure number 2 of the day. (2)
I quickly changed the image name in my application.properties to one that I knew did exist and repeated the deployment, which fortunately succeeded this time. We moved forward with our presentation and Alex explained what Kubernetes secrets are and how to use them, after which I showed how easy it is to work with secrets from a Quarkus application with the kubernetes-config extension. I restarted my Quarkus Dev Mode which I had stopped to build the container image previously, and… it failed. Apparently the Kafka Testcontainer failed to start up again when I restarted Dev Mode and this blocked the runtime, so I could not show how the Kubernetes manifests would update on the fly with secret configurations I had added to my application.properties. (3)
Fortunately I had a backup app, so I opened it up in another VS Code window and showed in the generated manifest what it would have looked like in the demo app I was building. Not being able to run my app in Dev Mode was problematic though, because it meant my next demos would fail as well. Those demos were meant to show how additional Kubernetes configuration and other capabilities, such as metrics, OpenTelemetry tracing, and resource limits and requests, would be added on the fly. (4) I could not run my backup app in Dev Mode either. First, I forgot to stop Dev Mode for “my-awesome-app”, which then blocked port 8080 for the backup app. (5) Even after I fixed that, Dev Mode wouldn’t start up properly because it ran into the same Kafka Testcontainer issue. (6)
I like to think I have pretty good nerves when I’m on stage and things go wrong, but at this point I have to admit I was starting to get a little flustered with so many things unexpectedly failing. This is no fun when you are on stage with 600+ people watching you struggle! After a moment of hesitation I carried on talking and explaining (going silent and trying to frantically debug issues gets awkward very quickly for the audience). Thanks also to Alex filling the silence and suggesting solutions, we came up with the idea to remove the problematic Kafka dependency, which fixed our Quarkus Dev Mode. Hurray! At least we could now show the additional Quarkus extensions and capabilities we had planned to show off. The “flow” of the demo was off at this point though, and there were a few things I no longer had time to show.
And so we moved on to the thing we were most excited about: a Quarkus-based car racing game, deployed on Kubernetes, where the audience could participate by tapping on their phones to create events that would power the cars in the game. We’ve run this game many times before without major issues, but... yet again the Demo Gods were really against us, and for some reason the cars did not move smoothly like they were supposed to. The game did end up recovering a little, so the impact was relatively minimal this time. (7)
We proceeded with an explanation of how to deploy new features to a “production-grade” app on Kubernetes using CI/CD and GitOps. Our final demo was going to show how, by just committing/merging some new code, we could leverage a CI/CD pipeline that would build the application and update a Kubernetes manifest, so that a GitOps engine (ArgoCD) would pick up the change and re-deploy our application. The new feature would let the audience shake their phones and use the phone’s accelerometer to create events and power the cars. But again the Demo Gods interfered, and our pipeline would just sit idle and refuse to run. (8)
We were pretty much out of time now so we ended up skipping the final demo. It would have been a fantastic way to leave the audience with a moment of fun and a memorable experience. Bummer.
What happened, and what could we have done to prevent or circumvent it?
Clearly, a lot of things went wrong during the demos of this session. Let’s sum them up:
1: A new version of Quarkus was released a few hours before the demo
This one I caught in time, and I was able to pull down the updated dependencies. Releases usually happen on a Thursday, so next time I should be aware of this and either make sure everything is cached or, better yet, use a specific version I’ve already tested. The downside is that pinning a version makes the “quarkus create app” command a little more cluttered, which is why I usually avoid it in the first place. On the other hand, it could also avoid other potential issues with bugs related to a new release.
2: Logged out of container registry
Even though I checked 30 minutes before my session that I was still logged in, I ended up being logged out of the container registry and had to log back in. I think I will change my password to something I will remember, so next time I can just quickly type in the username and password and retry. Alternatively, Podman Desktop is supposed to be able to remember your credentials, so I should see if that works.
3: Kafka Testcontainer failed
During my demo dry runs I had run into a bug related to Podman and crun (https://github.com/containers/podman/issues/3024#issuecomment-1733759900) which caused issues with starting up Testcontainers with Quarkus Dev Mode. I thought the issue I ran into during the live demo had to do with this, even though I had a workaround and the first Kafka Testcontainer had worked just fine. Either way, I was not about to start debugging and waste precious time we needed for the rest of our session. After the talk, another developer advocate (Sebastien Blanc) told me that he too had run into this bug, and that it was an issue with the Kafka Testcontainer itself (which is based on a Redpanda container image) that prevents it from starting up again once it has been stopped. To be honest, I had not run into this issue before and I cannot reproduce it now, so I’m still unsure of what really caused the error.
What could I have done to prevent it? I’m not sure… To me this one goes in the category of “Demo Gods gonna be Demo Gods”.
4: Subsequent demos were affected by the Kafka Testcontainer bug
The Testcontainer bug that prevented Quarkus Dev Mode from running correctly affected the next demos as well. I was still in the mindset that the Dev Mode issue was caused by the Podman/crun bug, and so I didn’t think about removing the Kafka dependency. Fortunately Alex suggested removing it, and that allowed me to show the remaining demos. Too bad we only did this later, after we had already tried switching to the other project.
What could I have done better? The good thing is that I actually did have a backup project ready. The bad thing is that it ran into the same error. At least I had already generated the Kubernetes manifest content I wanted to show, so even though I couldn’t show how it would load “live”, I was able to show it anyway.
5: Forgot to stop Dev Mode on the failed demo project
OK, this one is easy. I should’ve paid more attention and stopped Dev Mode on the original project before switching to another project. At least I did notice pretty quickly in my backup project’s logs that port 8080 was blocked and therefore Dev Mode was still running somewhere else.
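A small pre-flight check could catch a leftover Dev Mode before switching projects. The following is only a minimal sketch in plain Java and was not part of the demo: the class and method names are made up for illustration, and it assumes the app listens on port 8080, the port mentioned above.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper for illustration; not part of the original demo project.
class DemoPreflight {

    // Returns true if this process can bind the given local port,
    // which means nothing else is currently listening on it.
    static boolean isPortFree(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return true;  // bind succeeded, so the port was free
        } catch (IOException e) {
            return false; // bind failed, most likely because the port is in use
        }
    }

    public static void main(String[] args) {
        int port = 8080; // the port that the forgotten Dev Mode was blocking
        if (isPortFree(port)) {
            System.out.println("Port " + port + " is free.");
        } else {
            System.out.println("Port " + port
                    + " is busy. Is Dev Mode still running in another project?");
        }
    }
}
```

Running something like this right before opening the backup project would make a forgotten Dev Mode obvious before the audience sees a startup error.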
6: Backup project ran into the same Testcontainer issue
Again, if I hadn’t thought the issue was caused by the Podman/crun bug, maybe I would’ve just ripped out the Kafka dependency sooner and not needed to switch to the backup project. Expecting a different result with the same dependencies was wishful thinking, but hey, “have you tried restarting” often works, right? 🙂
7: Quinoa Wind Turbine game cars were moving erratically
Though the game did end up working and completing, there was a noticeable slowdown at some point. We did some analysis after the session and suspect it was an I/O issue, especially considering the conference network and 400 players sending a lot of concurrent events over the network at the same time. We’re making some adjustments to improve the throughput.
8: The pipeline to roll out the new version was unresponsive
The CI pipeline was unresponsive, so we could not roll out the new version of our game. Analysis after the session showed that a remote registry hosting the container image needed to run the git clone command was having issues and returning 503 errors. Curse you, Demo Gods!! Next time I should have an image tag ready so I can manually update the Kubernetes manifest to point to an existing pre-built image with the “new feature”, which would automatically let ArgoCD roll it out. Alternatively, I could have a parallel version of the game already available in another namespace and point the QR code for the second version of the game to that one.
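The manual fallback of pointing the manifest at a pre-built image can be as small as a one-line change. The sketch below shows that idea in plain Java. It is an illustration only: the class name, the registry and the image tags are placeholders rather than the ones used in the talk, and it simply rewrites every unquoted "image:" value in the manifest text instead of parsing the YAML properly.

```java
import java.util.regex.Matcher;

// Hypothetical helper for illustration; registry and tags below are placeholders.
class ManifestImageSwap {

    // Replaces the value of every "image:" field in the manifest text with newImage.
    // Assumes simple, unquoted image values on a single line.
    static String withImage(String manifestYaml, String newImage) {
        String imageLine = "(?m)^([ \\t]*(?:-[ \\t]*)?image:[ \\t]*)\\S+";
        return manifestYaml.replaceAll(imageLine,
                "$1" + Matcher.quoteReplacement(newImage));
    }

    public static void main(String[] args) {
        String manifest = String.join("\n",
                "spec:",
                "  containers:",
                "    - name: game",
                "      image: registry.example.com/demo/game:1.0");
        // Point the deployment at a pre-built image that already has the new feature.
        System.out.println(
                withImage(manifest, "registry.example.com/demo/game:shake-feature"));
    }
}
```

The updated manifest would still need to be committed to the repository the GitOps engine watches before any rollout happens.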
To wrap it up, a lot can go wrong during a live demo. Sometimes it’s in our control; most of the time it’s not. The important thing is to not get too rattled, to carry on, and not to let things get awkward for the audience. I think we did a fine job battling our way through the issues, and the applause at the end proved that the audience appreciated our efforts (I think!). I’m curious to know what you think I should’ve done better, or to hear your experiences in similar situations. Feel free to leave your comments below, or post them on the various social media channels I’m on and tag me.