Testing and evaluation are critical steps in the development and deployment of connected and automated vehicles (CAVs), yet there is no systematic way to generate testing scenarios for evaluating CAVs. The goal of this paper is to develop a general framework that can be used to construct a testing scenario library and to evaluate CAVs with that library. To this end, four research problems are identified, and a unified framework is proposed. A hierarchical structure is designed to describe testing scenarios at three levels: functional scenarios, logical scenarios, and specific scenarios. Functional and logical scenarios are identified by investigating crash typologies from crash databases. A new concept, the valuable region, is proposed to determine specific scenarios, and a search method based on optimization and seed-fill is formulated to find the valuable region. A CAV evaluation method using the generated library is proposed, which includes an ε-greedy sampling policy, an augmented reality testing environment, and index-based evaluation. Unlike previous studies that focus only on safety, a new index, functionality, is proposed to evaluate the efficiency of a CAV in completing different driving tasks. Two case studies are presented to validate the proposed framework and provide guidelines for implementation.
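To illustrate the ε-greedy sampling policy mentioned above, the following is a minimal sketch, not the paper's implementation: with probability ε a scenario is drawn uniformly from the library (exploration), and otherwise the scenario with the highest current value estimate is selected (exploitation). The scenario names, the `value_estimates` mapping, and the function signature are hypothetical placeholders.

```python
import random

def epsilon_greedy_sample(scenarios, value_estimates, epsilon=0.1, rng=random):
    """Select a test scenario under an epsilon-greedy policy.

    With probability `epsilon`, explore by sampling uniformly from the
    scenario library; otherwise, exploit by returning the scenario with
    the highest estimated value (e.g., expected testing informativeness).
    """
    if rng.random() < epsilon:
        return rng.choice(scenarios)          # exploration: uniform draw
    return max(scenarios, key=lambda s: value_estimates[s])  # exploitation

# Hypothetical usage: three placeholder scenarios with value estimates.
scenarios = ["cut_in", "hard_brake", "lane_change"]
value_estimates = {"cut_in": 0.9, "hard_brake": 0.4, "lane_change": 0.1}
chosen = epsilon_greedy_sample(scenarios, value_estimates, epsilon=0.1)
```

In a library-based evaluation loop, the value estimates would be updated after each test run, so the policy gradually concentrates sampling on the scenarios that remain most informative while still occasionally probing the rest of the library.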