
Parallel two-stage reduction to Hessenberg form using dynamic scheduling on shared-memory architectures
Keywords: Hessenberg reduction; Blocked algorithm; Parallel computing; Dynamic scheduling; High performance; Multi-core; Memory hierarchies