Of the 44 regions chosen for inclusion in the ENCODE experiment, 30 were selected randomly from
groups pre-classified by non-exonic conservation and gene density.
Although there were more randomly selected regions than manually selected regions, the two sets
of regions covered approximately the same total amount of sequence. All the randomly selected regions
were 500 kb long, while the manually selected regions varied in size, with some as large as 2 Mb.
Of the 1097 CDS sequences, 661 came from manually selected regions and 436 from "randomly" selected
regions. So, while it is true that the "randomly" chosen regions contained fewer CDS sequences than those
chosen for their biological interest, no extrapolations can be made from this comparison, due to the
nature of the "random" selection process.
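
The split quoted above can be checked with a few lines of arithmetic. The following is only a minimal sketch using the counts given in the text. The variable and function names are illustrative, and the figure of 14 manually selected regions is derived here as 44 minus 30 rather than stated directly. The per-region averages are purely descriptive and, for the reason just given, should not be read as an estimate for the genome as a whole.

```python
# Counts taken from the text; all names are illustrative only.
TOTAL_REGIONS = 44
RANDOM_REGIONS = 30
MANUAL_REGIONS = TOTAL_REGIONS - RANDOM_REGIONS  # derived, not stated in the text

CDS_TOTAL = 1097
CDS_MANUAL = 661
CDS_RANDOM = 436


def summarize():
    """Return the share of CDS sequences and the mean count per region for each set."""
    # Sanity check: the two subsets must add up to the reported total.
    if CDS_MANUAL + CDS_RANDOM != CDS_TOTAL:
        raise ValueError("CDS counts do not add up to the reported total")
    return {
        "manual": {
            "share": CDS_MANUAL / CDS_TOTAL,
            "per_region": CDS_MANUAL / MANUAL_REGIONS,
        },
        "random": {
            "share": CDS_RANDOM / CDS_TOTAL,
            "per_region": CDS_RANDOM / RANDOM_REGIONS,
        },
    }


if __name__ == "__main__":
    for name, stats in summarize().items():
        print(f"{name}: {stats['share']:.1%} of CDS sequences, "
              f"{stats['per_region']:.1f} per region on average")
```

The averages hide a wide spread between individual regions, so they serve only as a consistency check on the totals.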
Regions ENr112 (chromosome 2), ENr311 (chromosome 14), and ENr313 (chromosome 16) do not have
any CDS sequences, while regions ENr113, ENr114, ENr211, ENr213, and ENr312 have just one sequence
plus alternative splicing variants.
In contrast, none of the manually chosen regions has fewer than two sequences plus splice isoforms. Region ENm012 has the fewest (just 4 sequences), while ENm006 has 118 sequences.