As others have stated, rendering multiple viewports from different perspectives is more demanding than rendering one, even if the total pixel count is similar. Mirroring one eye to another screen adds very little load, but rendering the two separate eye views in the first place is not cheap.
CPU load on the render threads has almost nothing to do with the number of pixels being drawn; the total number of draw calls and/or the number of vertices/primitives/fragments is usually the overriding factor. Even the GPU side of things can be geometry limited, or subject to severe cache contention, even if raw fill rate never becomes an issue.
Reference: "High performance stereo rendering for VR", Timothy Wilson, San Diego Virtual Reality Meetup, January 20, 2015 (docs.google.com)
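To put rough numbers on the point about draw calls versus pixels, here is a toy cost model. It is only a sketch: the `Scene` class, the `frame_cost_ms` function, and the per-draw-call and per-pixel costs are all made-up placeholders, and it assumes a naive stereo path that resubmits every draw call once per eye. A game using a smarter stereo technique would not scale this way.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    draw_calls: int        # draw calls submitted per view (hypothetical)
    pixels_per_view: int   # pixels shaded per view


def frame_cost_ms(scene, views, cpu_us_per_draw=5.0, gpu_ns_per_pixel=2.0):
    """Return (cpu_ms, gpu_fill_ms) for one frame.

    Assumes every view resubmits all draw calls (naive stereo) and that
    the cost constants are illustrative, not measured.
    """
    cpu_ms = views * scene.draw_calls * cpu_us_per_draw / 1000.0
    gpu_fill_ms = views * scene.pixels_per_view * gpu_ns_per_pixel / 1e6
    return cpu_ms, gpu_fill_ms


# One 1920x1080 view versus two 960x1080 views: same total pixel count.
mono = frame_cost_ms(Scene(draw_calls=2000, pixels_per_view=1920 * 1080), views=1)
stereo = frame_cost_ms(Scene(draw_calls=2000, pixels_per_view=960 * 1080), views=2)

print("mono   cpu/gpu ms:", mono)
print("stereo cpu/gpu ms:", stereo)
```

In this model the fill cost is identical for both cases, while the render-thread cost doubles for the two-view case because the draw calls are issued twice.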
Figuring out what approach the game is taking to VR rendering is the first step in deciding what to do to improve performance. This could be inferred, with some effort, from observing performance and resource utilization as various settings are manipulated. Ideally, you'd only be limited by draw calls and number of vertices, and could increase CPU speed and/or selectively reduce geometry load (turning down the view distance scale and terrain quality) to improve VR performance while sacrificing as little else as possible.
Also, looking at aggregate CPU load doesn't say much. If you want a clear picture of CPU load, you need per-core utilization, preferably with a polling interval short enough to catch relevant spikes. GPU load very near maximum is usually a strong indication that you are not CPU limited, but there can be exceptions if the GPU is not pegged at 99%+.
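As a quick illustration of why the aggregate figure hides things, the sketch below summarizes a set of per-core samples. The function names, the 95% threshold, and the sample data are hypothetical; real samples could come from any monitoring tool that reports per-core utilization at a short interval (for example `psutil.cpu_percent(interval=0.1, percpu=True)` in Python).

```python
def summarize(samples):
    """samples: list of polls, each a list of per-core utilization in percent.

    Returns (highest aggregate seen, highest single-core value seen).
    """
    aggregates = [sum(poll) / len(poll) for poll in samples]
    peaks = [max(poll) for poll in samples]
    return max(aggregates), max(peaks)


def likely_core_bound(samples, core_threshold=95.0):
    """True if any single core reached the (arbitrary) threshold in any poll."""
    return any(max(poll) >= core_threshold for poll in samples)


# Hypothetical data: 8 cores, one render thread pegging a single core.
samples = [
    [100.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0],
    [98.0, 12.0, 9.0, 11.0, 10.0, 8.0, 10.0, 10.0],
]

worst_aggregate, worst_core = summarize(samples)
print("worst aggregate:", worst_aggregate)
print("worst single core:", worst_core)
print("likely limited by one core:", likely_core_bound(samples))
```

Here the aggregate never gets much above 21%, which looks like plenty of headroom, yet one core is saturated the whole time.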