User Details
- User Since
- Aug 17 2015, 6:48 PM (483 w, 6 d)
- Availability
- Available
- IRC Nick
- phedenskog
- LDAP User
- Unknown
- MediaWiki User
- PHedenskog (WMF) [ Global Accounts ]
Fri, Nov 22
This has finally been fixed. The root cause was a feature called "search suggestions" that makes a couple of requests back to Mozilla during the test and disturbs the rest of the test. I got help on how to turn it off using a Firefox preference. I'm going to roll it out tomorrow together with Firefox 132 so we test on the latest browser and can take a new baseline on Sunday. I will close this when the fix is out.
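For reference, here's a minimal sketch of what turning this off looks like if you set the preference through Selenium. This is only an illustration, not our actual browsertime configuration, and the exact preference names (browser.search.suggest.enabled and browser.urlbar.suggest.searches) are my assumption since the comment above doesn't name the one we ended up using:

```
# Sketch only: disable Firefox search suggestions for a test run via Selenium.
# The preference names below are the commonly used ones for this feature and
# are an assumption, not necessarily the exact preference used in our fix.
from selenium import webdriver

options = webdriver.FirefoxOptions()
# Stop Firefox from fetching search suggestions during the test.
options.set_preference("browser.search.suggest.enabled", False)
options.set_preference("browser.urlbar.suggest.searches", False)

driver = webdriver.Firefox(options=options)
driver.get("https://en.wikipedia.org/wiki/Barack_Obama")
driver.quit()
```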
It's been confirmed that Firefox 132 added some new requests for search suggestions that happen at the same time as the tests. Waiting for Mozilla to get back to us on how to disable those requests.
Thu, Nov 21
Wed, Nov 20
I'm wondering if calling flush() directly in the code (yes, it will make it uglier) can make the page start to render earlier? It seems like there's still a lot of time before the browser starts to consider rendering the page, and I'm thinking that is because the server holds on to the data?
I added dashboards, but not the actual alerts, for this yesterday for the English Wikipedia. The thresholds stopped me. We probably do not want to alert on an increase of just one element in recalculate style. I think an approach would be: if we have a significant change both in the number of elements and in the time spent in recalculate style, we fire an alert. I can add that later this week.
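Here's a rough sketch of that dual-criteria idea, using the Mann-Whitney U approach mentioned in my Nov 18 comment further down. It's only an illustration, not the code in our alerting setup: the data shape, the helper name and the 0.05 threshold are assumptions.

```
# Sketch only: alert when BOTH the recalculate-style element count and the
# recalculate-style time show a significant shift between baseline runs and
# current runs. The helper name, data shape and alpha are illustrative.
from scipy.stats import mannwhitneyu

def should_alert(baseline_counts, current_counts,
                 baseline_times, current_times, alpha=0.05):
    # Two-sided Mann-Whitney U test on each metric independently.
    _, p_count = mannwhitneyu(baseline_counts, current_counts,
                              alternative="two-sided")
    _, p_time = mannwhitneyu(baseline_times, current_times,
                             alternative="two-sided")
    # Only fire when both the element count and the time spent moved.
    return p_count < alpha and p_time < alpha

# Example: element counts per run and recalculate-style time (ms) per run.
baseline_counts = [120, 118, 121, 119, 120]
current_counts = [131, 129, 132, 130, 133]
baseline_times = [34.0, 33.5, 35.1, 34.2, 33.8]
current_times = [41.2, 40.8, 42.0, 41.5, 40.9]

print(should_alert(baseline_counts, current_counts,
                   baseline_times, current_times))
```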
Mon, Nov 18
I did a quick check and those metrics are already used in the calculations, so it's just a matter of setting up the dashboard. Then you can choose between what's categorised as a small/medium/large regression, which makes it easier to tune. Let me set up an example and then we can have a look.
Let me see if we can use the metrics with Mann-Whitney U; then we don't need to set a limit. I'll have a look tomorrow and get back.
For the late loading of banners, the performance team did a study in 2019 on how it affects user satisfaction: https://techblog.wikimedia.org/2019/06/13/performance-perception-the-effect-of-late-loading-banners/
This has been resolved, metrics look ok.
Let T373172 decide which user journeys will be implemented.
This has already been done!
I still need to update the server version, let me do that today.
Fri, Nov 15
I think we don't need to spend time on this. It's not perfect, but let's keep it as is.
These are fixed.
This is not an issue anymore
I think we should point to other people's documentation. Here's another good resource on how to annotate traces: https://developer.chrome.com/blog/devtools-annotations
I think those have been fixed, right?
This has been added to the on demand testing documentation.
Let this be a part of the user journey creation if this one is a priority.
This is implemented (except the device lab part).
I think this is because we end the test too early. We also run some banners with a 7.5 second delay.
I added the alerts when Google fixed their things.
This works ok, so let's not spend any time on this now.
I created https://github.com/sitespeedio/browsertime/issues/2198 for this issue; I think we can close our version.
These tests have been removed.
I don't think this is a priority.
We have much more space now that we run Graphite on a bare metal server.
This is the same as T373172
This is not an issue now with the bare metal servers where we have full control.
This has been fixed over time. I don't know exactly when, but it's not occurring now.
This was fixed when we moved to bare metal server tests.
@Jdlrobson I agree, I think we can close this as declined and then we can open up a new issue if we see the same with 2022?
I've been talking to the Mozilla performance team to try to get some help. This is what I know right now: the regression is only visible when we run tests in a Docker container, and only in our direct tests. The regression slows down many metrics, for example TTFB, FCP and LCP. However, there's no difference in the CPU benchmark.
Thu, Nov 14
I have some input:
Hi, let me know if you want to discuss it! I'll share some thoughts here:
This is finally done. The new server is crux-metrics.webperformancetest.eqiad1.wikimedia.cloud and the documentation has been updated. I have a problem with Cloud VPS: when I create a new instance, sometimes I cannot log in to it. I recreate it, same thing. I give it a new name, it works! I recreate it with the old name, it fails. So something is going on there; I had the problem before. I need to report it though.
Wed, Nov 13
Hi @gabriel-wmde, thanks for the investigation. I agree with you that it doesn't look like a quick fix, and I like your idea that we all need to work together.
I think this is my fault, from the change that removed -it? I'm also thinking it could be fixed by adding an env variable that adds -it to the Selenium container?
I'm thinking we should hold off on that, but let's loop in @zeljkofilipin.
Tue, Nov 12
I also created a bug for the Firefox net log on Ubuntu, which doesn't seem to work.
Mon, Nov 11
Thank you!
Sat, Nov 9
Fri, Nov 8
Thanks for the fix, I can see the Selenium tests running now.
Hi @gabriel-wmde, sorry for being slow on this. Ok, let me explain the performance tests:
I've pinged the Mozilla performance team about it. Checking the changelog for 132, I couldn't see anything obvious.
Thu, Nov 7
This has been implemented and documented: https://wikitech.wikimedia.org/wiki/Performance/Synthetic_testing/Run_a_test
I added two jobs in https://github.com/soulgalore/mediawiki-quickstart-test/actions as a test. @Mhurd and @zeljkofilipin, you are invited. The CI job is stuck on rm permissions. The fresh install seems to work, but let's add some tests to verify it.
All WebPageReplay and direct tests now link to https://wikitech.wikimedia.org/wiki/Performance/Guides/Regressions
I made a test release to verify that it really works.
This is done and the documentation is updated. Let's sync with @Mhurd before we communicate with the web team.
@Mhurd I've added sections for updating and reverting the version, please check when you have time that it's correct.