A very nice reference is "Computer Structures: Readings and Examples" by Bell
and Newell (McGraw-Hill, 1971). There's a section in it about "Virtual-address
space and memory mapping" (pages 77-80). It describes a whole lot of schemes and
lists early instances. For example, there are schemes that have protection but not
relocation (IBM 1800, per word; SDS Sigma-2, per page). There are schemes with a
base address and length (PDP-6, CDC 6000) or two of those (PDP-10). And there are page
mapping schemes (Atlas, CDC 3500) and segment descriptor table ones (Burroughs 5500). All
of those date from the early to mid 1960s.
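
To make the difference between the base-and-length schemes and the page mapping schemes concrete, here is a rough sketch in Python. It is only an illustration of the general ideas, not a description of how any of the machines above actually worked; the function names, the exception, and the 512-word page size in the usage note are all made up for the example.

```python
class BoundsFault(Exception):
    """Raised when a program address falls outside what is mapped (hypothetical name)."""


def relocate(virtual_addr, base, length):
    """Base-and-length scheme: check against the length, then add the base."""
    if not 0 <= virtual_addr < length:
        raise BoundsFault(f"address {virtual_addr} outside region of length {length}")
    return base + virtual_addr


def translate_paged(virtual_addr, page_table, page_size):
    """Page mapping scheme: look up the page number, keep the offset within the page.

    page_table[i] is the physical frame number for virtual page i,
    or None if that page is not mapped (an assumption of this sketch).
    """
    page, offset = divmod(virtual_addr, page_size)
    if virtual_addr < 0 or page >= len(page_table) or page_table[page] is None:
        raise BoundsFault(f"address {virtual_addr} is not mapped")
    return page_table[page] * page_size + offset
```

With these definitions, relocate(5, 1000, 100) gives 1005, while relocate(100, 1000, 100) faults. With a page table of [3, None, 9] and 512-word pages, address 2*512 + 7 maps to 9*512 + 7, and anything in page 1 faults. A two-pair scheme would simply choose between two (base, length) pairs before doing the same check and add.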
paul