Last week I wrote about a nasty bug that took a while to solve. While that one was in progress, I had another one going on simultaneously that was equally challenging.
We were running some relatively simple load tests and our app would break. The result was lots of socket errors, usually 10055 (WSAENOBUFS, "no buffer space available"). There are various articles on this error, but none of them solved my problem. I was seeing something different.
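If you want a load test to flag this specific failure instead of lumping it in with every other socket error, a check like the one below can help. This is only a sketch, and it assumes the test client is written in Python, which was not necessarily the case for our app. The helper name `is_no_buffer_space` is made up for illustration.

```python
# Winsock error 10055 (WSAENOBUFS): no buffer space available.
WSAENOBUFS = 10055


def is_no_buffer_space(exc):
    """Return True if an OSError looks like Winsock error 10055.

    Hypothetical helper: on Windows the Winsock code may show up as
    exc.winerror, exc.errno, or both, so this checks each of them.
    """
    codes = {getattr(exc, "winerror", None), getattr(exc, "errno", None)}
    return WSAENOBUFS in codes
```

In a load-test loop you could catch `OSError` around the socket calls and count how many failures match, which makes it easier to compare a box that works with one that does not.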
Oddly, though, the same tests worked on some of our boxes. So the challenge became…what’s the difference? I started going through registry settings, network configurations, services, hardware, etc. Finally, I found one Windows article that really shed some light on the issue. It talks about the effects of the /3GB switch in the boot.ini file. What’s that all about? The article explains that when this switch is in use, user-mode processes get 3 GB of address space and the kernel is left with only 1 GB, so various kernel memory allocations are reduced. The effects of the reduced page table entries sounded exactly like my problem! So I removed the switch, rebooted, and…it worked!
This article talks about using a /userva switch in addition to /3GB to fine-tune the memory split further.
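Since the hard part for me was spotting what differed between the working and failing boxes, here is a rough sketch of a script that reports which boot.ini entries carry /3GB and /USERVA. The function name `boot_switches` and the sample boot.ini contents are invented for illustration, and the value 2900 (in megabytes) is only an example, not a recommendation for your system.

```python
import re

# Made-up sample; a real boot.ini will have different paths and labels.
SAMPLE = r"""[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003" /fastdetect /3GB /USERVA=2900
"""


def boot_switches(boot_ini_text):
    """Report /3GB and /USERVA usage for each [operating systems] entry."""
    entries = []
    in_os_section = False
    for raw in boot_ini_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Only lines under [operating systems] carry boot switches.
            in_os_section = line.lower() == "[operating systems]"
            continue
        if not in_os_section or "=" not in line:
            continue
        userva = re.search(r"/USERVA=(\d+)", line, re.IGNORECASE)
        entries.append({
            "entry": line,
            "3gb": re.search(r"/3GB\b", line, re.IGNORECASE) is not None,
            "userva_mb": int(userva.group(1)) if userva else None,
        })
    return entries
```

Running this against the boot.ini from each box and comparing the results would have pointed me at the /3GB difference much sooner than walking through registry settings by hand.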
I am trying to get better about blogging these types of issues that take a long time to figure out. I’m hoping that more links to the solution will help others get there faster than I did.