





 

 









University Physics
Volume 3








SENIOR CONTRIBUTING AUTHORS
SAMUEL J. LING, TRUMAN STATE UNIVERSITY
JEFF SANNY, LOYOLA MARYMOUNT UNIVERSITY
WILLIAM MOEBS, PHD









 

 


OpenStax
Rice University
6100 Main Street MS-375
Houston, Texas 77005

To learn more about OpenStax, visit https://openstax.org.
Individual print copies and bulk orders can be purchased through our website.

©2017 Rice University. Textbook content produced by OpenStax is licensed under a Creative Commons
Attribution 4.0 International License (CC BY 4.0). Under this license, any user of this textbook or the textbook
contents herein must provide proper attribution as follows:


- If you redistribute this textbook in a digital format (including but not limited to PDF and HTML), then you
must retain on every page the following attribution:
“Download for free at https://openstax.org/details/books/university-physics-volume-3.”


- If you redistribute this textbook in a print format, then you must include on every physical page the
following attribution:
“Download for free at https://openstax.org/details/books/university-physics-volume-3.”


- If you redistribute part of this textbook, then you must retain in every digital format page view (including
but not limited to PDF and HTML) and on every physical printed page the following attribution:
“Download for free at https://openstax.org/details/books/university-physics-volume-3.”


- If you use this textbook as a bibliographic reference, please include
https://openstax.org/details/books/university-physics-volume-3 in your citation.



For questions regarding this licensing, please contact support@openstax.org.

Trademarks
The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, OpenStax CNX logo,
OpenStax Tutor name, OpenStax Tutor logo, Connexions name, Connexions logo, Rice University name, and
Rice University logo are not subject to the license and may not be reproduced without the prior and express
written consent of Rice University.

PRINT BOOK ISBN-10 1-938168-18-6
PRINT BOOK ISBN-13 978-1-938168-18-5
PDF VERSION ISBN-10 1-947172-22-0
PDF VERSION ISBN-13 978-1-947172-22-7
ENHANCED TEXTBOOK ISBN-10 1-947172-22-0
ENHANCED TEXTBOOK ISBN-13 978-1-947172-22-7
Revision Number UP3-2016-001(03/17)-BW
Original Publication Year 2016




 

 


OPENSTAX
OpenStax provides free, peer-reviewed, openly licensed textbooks for introductory college and Advanced Placement®
courses and low-cost, personalized courseware that helps students learn. A nonprofit ed tech initiative based at Rice
University, we’re committed to helping students access the tools they need to complete their courses and meet their
educational goals.



RICE UNIVERSITY
OpenStax, OpenStax CNX, and OpenStax Tutor are initiatives of Rice University. As a leading research university with a
distinctive commitment to undergraduate education, Rice University aspires to path-breaking research, unsurpassed
teaching, and contributions to the betterment of our world. It seeks to fulfill this mission by cultivating a diverse community
of learning and discovery that produces leaders across the spectrum of human endeavor.





FOUNDATION SUPPORT

OpenStax is grateful for the tremendous support of our sponsors. Without their strong engagement, the goal
of free access to high-quality textbooks would remain just a dream.


Laura and John Arnold Foundation (LJAF) actively seeks opportunities to invest in organizations and
thought leaders that have a sincere interest in implementing fundamental changes that not only
yield immediate gains, but also repair broken systems for future generations. LJAF currently focuses
its strategic investments on education, criminal justice, research integrity, and public accountability.



The William and Flora Hewlett Foundation has been making grants since 1967 to help solve social
and environmental problems at home and around the world. The Foundation concentrates its
resources on activities in education, the environment, global development and population,
performing arts, and philanthropy, and makes grants to support disadvantaged communities in the
San Francisco Bay Area.

Calvin K. Kazanjian was the founder and president of Peter Paul (Almond Joy), Inc. He firmly believed
that the more people understood about basic economics the happier and more prosperous they
would be. Accordingly, he established the Calvin K. Kazanjian Economics Foundation Inc, in 1949 as a
philanthropic, nonpolitical educational organization to support efforts that enhanced economic
understanding.



Guided by the belief that every life has equal value, the Bill & Melinda Gates Foundation works to
help all people lead healthy, productive lives. In developing countries, it focuses on improving
people’s health with vaccines and other life-saving tools and giving them the chance to lift
themselves out of hunger and extreme poverty. In the United States, it seeks to significantly
improve education so that all young people have the opportunity to reach their full potential. Based
in Seattle, Washington, the foundation is led by CEO Jeff Raikes and Co-chair William H. Gates Sr.,
under the direction of Bill and Melinda Gates and Warren Buffett.

The Maxfield Foundation supports projects with potential for high impact in science, education,
sustainability, and other areas of social importance.



Our mission at The Michelson 20MM Foundation is to grow access and success by eliminating
unnecessary hurdles to affordability. We support the creation, sharing, and proliferation of more
effective, more affordable educational content by leveraging disruptive technologies, open
educational resources, and new models for collaboration between for-profit, nonprofit, and public
entities.


The Bill and Stephanie Sick Fund supports innovative projects in the areas of Education, Art, Science
and Engineering.











Table of Contents
Preface

Unit 1. Optics
Chapter 1: The Nature of Light
1.1 The Propagation of Light
1.2 The Law of Reflection
1.3 Refraction
1.4 Total Internal Reflection
1.5 Dispersion
1.6 Huygens’s Principle
1.7 Polarization
Chapter 2: Geometric Optics and Image Formation
2.1 Images Formed by Plane Mirrors
2.2 Spherical Mirrors
2.3 Images Formed by Refraction
2.4 Thin Lenses
2.5 The Eye
2.6 The Camera
2.7 The Simple Magnifier
2.8 Microscopes and Telescopes
Chapter 3: Interference
3.1 Young's Double-Slit Interference
3.2 Mathematics of Interference
3.3 Multiple-Slit Interference
3.4 Interference in Thin Films
3.5 The Michelson Interferometer
Chapter 4: Diffraction
4.1 Single-Slit Diffraction
4.2 Intensity in Single-Slit Diffraction
4.3 Double-Slit Diffraction
4.4 Diffraction Gratings
4.5 Circular Apertures and Resolution
4.6 X-Ray Diffraction
4.7 Holography

Unit 2. Modern Physics
Chapter 5: Relativity
5.1 Invariance of Physical Laws
5.2 Relativity of Simultaneity
5.3 Time Dilation
5.4 Length Contraction
5.5 The Lorentz Transformation
5.6 Relativistic Velocity Transformation
5.7 Doppler Effect for Light
5.8 Relativistic Momentum
5.9 Relativistic Energy
Chapter 6: Photons and Matter Waves
6.1 Blackbody Radiation
6.2 Photoelectric Effect
6.3 The Compton Effect
6.4 Bohr’s Model of the Hydrogen Atom
6.5 De Broglie’s Matter Waves
6.6 Wave-Particle Duality
Chapter 7: Quantum Mechanics
7.1 Wave Functions
7.2 The Heisenberg Uncertainty Principle
7.3 The Schrödinger Equation
7.4 The Quantum Particle in a Box
7.5 The Quantum Harmonic Oscillator
7.6 The Quantum Tunneling of Particles through Potential Barriers
Chapter 8: Atomic Structure
8.1 The Hydrogen Atom
8.2 Orbital Magnetic Dipole Moment of the Electron
8.3 Electron Spin
8.4 The Exclusion Principle and the Periodic Table
8.5 Atomic Spectra and X-rays
8.6 Lasers
Chapter 9: Condensed Matter Physics
9.1 Types of Molecular Bonds
9.2 Molecular Spectra
9.3 Bonding in Crystalline Solids
9.4 Free Electron Model of Metals
9.5 Band Theory of Solids
9.6 Semiconductors and Doping
9.7 Semiconductor Devices
9.8 Superconductivity
Chapter 10: Nuclear Physics
10.1 Properties of Nuclei
10.2 Nuclear Binding Energy
10.3 Radioactive Decay
10.4 Nuclear Reactions
10.5 Fission
10.6 Nuclear Fusion
10.7 Medical Applications and Biological Effects of Nuclear Radiation
Chapter 11: Particle Physics and Cosmology
11.1 Introduction to Particle Physics
11.2 Particle Conservation Laws
11.3 Quarks
11.4 Particle Accelerators and Detectors
11.5 The Standard Model
11.6 The Big Bang
11.7 Evolution of the Early Universe

Appendix A: Units
Appendix B: Conversion Factors
Appendix C: Fundamental Constants
Appendix D: Astronomical Data
Appendix E: Mathematical Formulas
Appendix F: Chemistry
Appendix G: The Greek Alphabet
Index


This OpenStax book is available for free at http://cnx.org/content/col12067/1.4




PREFACE
Welcome to University Physics, an OpenStax resource. This textbook was written to increase student access to high-quality learning materials, maintaining the highest standards of academic rigor at little to no cost.
About OpenStax
OpenStax is a nonprofit based at Rice University, and it’s our mission to improve student access to education. Our first openly licensed college textbook was published in 2012 and our library has since scaled to over 20 books used by hundreds of thousands of students across the globe. Our adaptive learning technology, designed to improve learning outcomes through personalized educational paths, is currently being piloted for K–12 and college. The OpenStax mission is made possible through the generous support of philanthropic foundations. Through these partnerships and with the help of additional low-cost resources from our OpenStax partners, OpenStax is breaking down the most common barriers to learning and empowering students and instructors to succeed.
About OpenStax Resources
Customization
University Physics is licensed under a Creative Commons Attribution 4.0 International (CC BY) license, which means that you can distribute, remix, and build upon the content, as long as you provide attribution to OpenStax and its content contributors.
Because our books are openly licensed, you are free to use the entire book or pick and choose the sections that are most relevant to the needs of your course. Feel free to remix the content by assigning your students certain chapters and sections in your syllabus in the order that you prefer. You can even provide a direct link in your syllabus to the sections in the web view of your book.
Faculty also have the option of creating a customized version of their OpenStax book through the OpenStax Custom platform. The custom version can be made available to students in low-cost print or digital form through their campus bookstore. Visit your book page on openstax.org for a link to your book on OpenStax Custom.
Errata
All OpenStax textbooks undergo a rigorous review process. However, like any professional-grade textbook, errors sometimes occur. Since our books are web based, we can make updates periodically when deemed pedagogically necessary. If you have a correction to suggest, submit it through the link on your book page on openstax.org. Subject matter experts review all errata suggestions. OpenStax is committed to remaining transparent about all updates, so you will also find a list of past errata changes on your book page on openstax.org.
Format
You can access this textbook for free in web view or PDF through openstax.org, and for a low cost in print.
About University Physics
University Physics is designed for the two- or three-semester calculus-based physics course. The text has been developed to meet the scope and sequence of most university physics courses and provides a foundation for a career in mathematics, science, or engineering. The book provides an important opportunity for students to learn the core concepts of physics and understand how those concepts apply to their lives and to the world around them.
Due to the comprehensive nature of the material, we are offering the book in three volumes for flexibility and efficiency.
Coverage and Scope
Our University Physics textbook adheres to the scope and sequence of most two- and three-semester physics courses nationwide. We have worked to make physics interesting and accessible to students while maintaining the mathematical rigor inherent in the subject. With this objective in mind, the content of this textbook has been developed and arranged to provide a logical progression from fundamental to more advanced concepts, building upon what students have already learned and emphasizing connections between topics and between theory and applications. The goal of each section is to enable students not just to recognize concepts, but to work with them in ways that will be useful in later courses and future careers. The organization and pedagogical features were developed and vetted with feedback from science educators dedicated to the project.






VOLUME I
Unit 1: Mechanics


Chapter 1: Units and Measurement
Chapter 2: Vectors
Chapter 3: Motion Along a Straight Line
Chapter 4: Motion in Two and Three Dimensions
Chapter 5: Newton’s Laws of Motion
Chapter 6: Applications of Newton’s Laws
Chapter 7: Work and Kinetic Energy
Chapter 8: Potential Energy and Conservation of Energy
Chapter 9: Linear Momentum and Collisions
Chapter 10: Fixed-Axis Rotation
Chapter 11: Angular Momentum
Chapter 12: Static Equilibrium and Elasticity
Chapter 13: Gravitation
Chapter 14: Fluid Mechanics


Unit 2: Waves and Acoustics
Chapter 15: Oscillations
Chapter 16: Waves
Chapter 17: Sound


VOLUME II
Unit 1: Thermodynamics


Chapter 1: Temperature and Heat
Chapter 2: The Kinetic Theory of Gases
Chapter 3: The First Law of Thermodynamics
Chapter 4: The Second Law of Thermodynamics


Unit 2: Electricity and Magnetism
Chapter 5: Electric Charges and Fields
Chapter 6: Gauss’s Law
Chapter 7: Electric Potential
Chapter 8: Capacitance
Chapter 9: Current and Resistance
Chapter 10: Direct-Current Circuits
Chapter 11: Magnetic Forces and Fields
Chapter 12: Sources of Magnetic Fields
Chapter 13: Electromagnetic Induction
Chapter 14: Inductance
Chapter 15: Alternating-Current Circuits
Chapter 16: Electromagnetic Waves


VOLUME III
Unit 1: Optics


Chapter 1: The Nature of Light
Chapter 2: Geometric Optics and Image Formation
Chapter 3: Interference
Chapter 4: Diffraction


Unit 2: Modern Physics
Chapter 5: Relativity
Chapter 6: Photons and Matter Waves
Chapter 7: Quantum Mechanics
Chapter 8: Atomic Structure
Chapter 9: Condensed Matter Physics
Chapter 10: Nuclear Physics
Chapter 11: Particle Physics and Cosmology


Pedagogical Foundation
Throughout University Physics you will find derivations of concepts that present classical ideas and techniques, as well as modern applications and methods. Most chapters start with observations or experiments that place the material in a context of physical experience. Presentations and explanations rely on years of classroom experience on the part of long-time physics professors, striving for a balance of clarity and rigor that has proven successful with their students. Throughout the text, links enable students to review earlier material and then return to the present discussion, reinforcing connections between topics. Key historical figures and experiments are discussed in the main text (rather than in boxes or sidebars), maintaining a focus on the development of physical intuition. Key ideas, definitions, and equations are highlighted in the text and listed in summary form at the end of each chapter. Examples and chapter-opening images often include contemporary applications from daily life or modern science and engineering that students can relate to, from smart phones to the internet to GPS devices.
Assessments That Reinforce Key Concepts
In-chapter Examples generally follow a three-part format of Strategy, Solution, and Significance to emphasize how to approach a problem, how to work with the equations, and how to check and generalize the result. Examples are often followed by Check Your Understanding questions and answers to help reinforce for students the important ideas of the examples. Problem-Solving Strategies in each chapter break down methods of approaching various types of problems into steps students can follow for guidance. The book also includes exercises at the end of each chapter so students can practice what they’ve learned.


• Conceptual questions do not require calculation but test student learning of the key concepts.
• Problems categorized by section test student problem-solving skills and the ability to apply ideas to practical situations.
• Additional Problems apply knowledge across the chapter, forcing students to identify what concepts and equations are appropriate for solving given problems. Randomly located throughout the problems are Unreasonable Results exercises that ask students to evaluate the answer to a problem and explain why it is not reasonable and what assumptions made might not be correct.
• Challenge Problems extend text ideas to interesting but difficult situations.


Answers for selected exercises are available in an Answer Key at the end of the book.
Additional Resources
Student and Instructor Resources
We’ve compiled additional resources for both students and instructors, including Getting Started Guides, PowerPoint slides, and answer and solution guides for instructors and students. Instructor resources require a verified instructor account, which can be requested on your openstax.org log-in. Take advantage of these resources to supplement your OpenStax book.
Partner Resources
OpenStax partners are our allies in the mission to make high-quality learning materials affordable and accessible to students and instructors everywhere. Their tools integrate seamlessly with our OpenStax titles at a low cost. To access the partner resources for your text, visit your book page on openstax.org.






About the Authors
Senior Contributing Authors
Samuel J. Ling, Truman State University
Dr. Samuel Ling has taught introductory and advanced physics for over 25 years at Truman State University, where he is currently Professor of Physics and the Department Chair. Dr. Ling has two PhDs from Boston University, one in Chemistry and the other in Physics, and he was a Research Fellow at the Indian Institute of Science, Bangalore, before joining Truman. Dr. Ling is also an author of A First Course in Vibrations and Waves, published by Oxford University Press. Dr. Ling has considerable experience with research in Physics Education and has published research on collaborative learning methods in physics teaching. He was awarded a Truman Fellow and a Jepson fellow in recognition of his innovative teaching methods. Dr. Ling’s research publications have spanned Cosmology, Solid State Physics, and Nonlinear Optics.
Jeff Sanny, Loyola Marymount University
Dr. Jeff Sanny earned a BS in Physics from Harvey Mudd College in 1974 and a PhD in Solid State Physics from the University of California–Los Angeles in 1980. He joined the faculty at Loyola Marymount University in the fall of 1980. During his tenure, he has served as department Chair as well as Associate Dean. Dr. Sanny enjoys teaching introductory physics in particular. He is also passionate about providing students with research experience and has directed an active undergraduate student research group in space physics for many years.
Bill Moebs, PhD
Dr. William Moebs earned a BS and PhD (1959 and 1965) from the University of Michigan. He then joined their staff as a Research Associate for one year, where he continued his doctoral research in particle physics. In 1966, he accepted an appointment to the Physics Department of Indiana Purdue Fort Wayne (IPFW), where he served as Department Chair from 1971 to 1979. In 1979, he moved to Loyola Marymount University (LMU), where he served as Chair of the Physics Department from 1979 to 1986. He retired from LMU in 2000. He has published research in particle physics, chemical kinetics, cell division, atomic physics, and physics teaching.
Contributing Authors
David Anderson, Albion College
Daniel Bowman, Ferrum College
Dedra Demaree, Georgetown University
Gerald Friedman, Santa Fe Community College
Lev Gasparov, University of North Florida
Edw. S. Ginsberg, University of Massachusetts
Alice Kolakowska, University of Memphis
Lee LaRue, Paris Junior College
Mark Lattery, University of Wisconsin
Richard Ludlow, Daniel Webster College
Patrick Motl, Indiana University–Kokomo
Tao Pang, University of Nevada–Las Vegas
Kenneth Podolak, Plattsburgh State University
Takashi Sato, Kwantlen Polytechnic University
David Smith, University of the Virgin Islands
Joseph Trout, Richard Stockton College
Kevin Wheelock, Bellevue College
Reviewers
Salameh Ahmad, Rochester Institute of Technology–Dubai
John Aiken, University of Colorado–Boulder
Anand Batra, Howard University
Raymond Benge, Tarrant County College
Gavin Buxton, Robert Morris University
Erik Christensen, South Florida State College
Clifton Clark, Fort Hays State University
Nelson Coates, California Maritime Academy
Herve Collin, Kapi’olani Community College
Carl Covatto, Arizona State University
Alexander Cozzani, Imperial Valley College
Danielle Dalafave, The College of New Jersey
Nicholas Darnton, Georgia Institute of Technology






Robert Edmonds, Tarrant County College
William Falls, Erie Community College
Stanley Forrester, Broward College
Umesh Garg, University of Notre Dame
Maurizio Giannotti, Barry University
Bryan Gibbs, Dallas County Community College
Mark Giroux, East Tennessee State University
Matthew Griffiths, University of New Haven
Alfonso Hinojosa, University of Texas–Arlington
Steuard Jensen, Alma College
David Kagan, University of Massachusetts
Jill Leggett, Florida State College–Jacksonville
Sergei Katsev, University of Minnesota–Duluth
Alfredo Louro, University of Calgary
James Maclaren, Tulane University
Ponn Maheswaranathan, Winthrop University
Seth Major, Hamilton College
Oleg Maksimov, Excelsior College
Aristides Marcano, Delaware State University
Marles McCurdy, Tarrant County College
James McDonald, University of Hartford
Ralph McGrew, SUNY–Broome Community College
Paul Miller, West Virginia University
Tamar More, University of Portland
Farzaneh Najmabadi, University of Phoenix
Richard Olenick, The University of Dallas
Christopher Porter, Ohio State University
Liza Pujji, Manakau Institute of Technology
Baishali Ray, Young Harris University
Andrew Robinson, Carleton University
Aruvana Roy, Young Harris University
Abhijit Sarkar, The Catholic University of America
Gajendra Tulsian, Daytona State College
Adria Updike, Roger Williams University
Clark Vangilder, Central Arizona University
Steven Wolf, Texas State University
Alexander Wurm, Western New England University
Lei Zhang, Winston Salem State University
Ulrich Zurcher, Cleveland State University






1 | THE NATURE OF LIGHT


Figure 1.1 Due to total internal reflection, an underwater swimmer’s image is reflected back into the water where the camera is located. The circular ripple in the image center is actually on the water surface. Due to the viewing angle, total internal reflection is not occurring at the top edge of this image, and we can see a view of activities on the pool deck. (credit: modification of work by “jayhem”/Flickr)


Chapter Outline
1.1 The Propagation of Light
1.2 The Law of Reflection
1.3 Refraction
1.4 Total Internal Reflection
1.5 Dispersion
1.6 Huygens’s Principle
1.7 Polarization


Introduction
Our investigation of light revolves around two questions of fundamental importance: (1) What is the nature of light, and (2) how does light behave under various circumstances? Answers to these questions can be found in Maxwell’s equations (in Electromagnetic Waves (http://cnx.org/content/m58495/latest/)), which predict the existence of electromagnetic waves and their behavior. Examples of light include radio and infrared waves, visible light, ultraviolet radiation, and X-rays. Interestingly, not all light phenomena can be explained by Maxwell’s theory. Experiments performed early in the twentieth century showed that light has corpuscular, or particle-like, properties. The idea that light can display both wave and particle characteristics is called wave-particle duality, which is examined in Photons and Matter Waves.
In this chapter, we study the basic properties of light. In the next few chapters, we investigate the behavior of light when it interacts with optical devices such as mirrors, lenses, and apertures.






1.1 | The Propagation of Light
Learning Objectives


By the end of this section, you will be able to:
• Determine the index of refraction, given the speed of light in a medium
• List the ways in which light travels from a source to another location


The speed of light in a vacuum c is one of the fundamental constants of physics. As you will see when you reach Relativity, it is a central concept in Einstein’s theory of relativity. As the accuracy of the measurements of the speed of light improved, it was found that different observers, even those moving at large velocities with respect to each other, measure the same value for the speed of light. However, the speed of light does vary in a precise manner with the material it traverses. These facts have far-reaching implications, as we will see in later chapters.
The Speed of Light: Early Measurements
The first measurement of the speed of light was made by the Danish astronomer Ole Roemer (1644–1710) in 1675. He studied the orbit of Io, one of the four large moons of Jupiter, and found that it had a period of revolution of 42.5 h around Jupiter. He also discovered that this value fluctuated by a few seconds, depending on the position of Earth in its orbit around the Sun. Roemer realized that this fluctuation was due to the finite speed of light and could be used to determine c.
Roemer found the period of revolution of Io by measuring the time interval between successive eclipses by Jupiter. Figure 1.2(a) shows the planetary configurations when such a measurement is made from Earth in the part of its orbit where it is receding from Jupiter. When Earth is at point A, Earth, Jupiter, and Io are aligned. The next time this alignment occurs, Earth is at point B, and the light carrying that information to Earth must travel to that point. Since B is farther from Jupiter than A, light takes more time to reach Earth when Earth is at B. Now imagine it is about 6 months later, and the planets are arranged as in part (b) of the figure. The measurement of Io’s period begins with Earth at point A′ and Io eclipsed by Jupiter. The next eclipse then occurs when Earth is at point B′, to which the light carrying the information of this eclipse must travel. Since B′ is closer to Jupiter than A′, light takes less time to reach Earth when it is at B′. This time interval between the successive eclipses of Io seen at A′ and B′ is therefore less than the time interval between the eclipses seen at A and B. By measuring the difference in these time intervals and with appropriate knowledge of the distance between Jupiter and Earth, Roemer calculated that the speed of light was $2.0 \times 10^{8}$ m/s, which is 33% below the value accepted today.


Figure 1.2 Roemer’s astronomical method for determining the speed of light. Measurements of Io’s period done with the configurations of parts (a) and (b) differ, because the light path length and associated travel time increase from A to B (a) but decrease from A′ to B′ (b).


The first successful terrestrial measurement of the speed of light was made by Armand Fizeau (1819–1896) in 1849. He placed a toothed wheel that could be rotated very rapidly on one hilltop and a mirror on a second hilltop 8 km away (Figure 1.3). An intense light source was placed behind the wheel, so that when the wheel rotated, it chopped the light beam into a succession of pulses. The speed of the wheel was then adjusted until no light returned to the observer located behind the wheel. This could only happen if the wheel rotated through an angle corresponding to a displacement of (n + ½) teeth, while the pulses traveled down to the mirror and back. Knowing the rotational speed of the wheel, the number of teeth on the wheel, and the distance to the mirror, Fizeau determined the speed of light to be $3.15 \times 10^{8}$ m/s, which is only 5% too high.
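The toothed-wheel reasoning can be turned into a short calculation. The sketch below is an illustration only: it reads "a displacement of (n + ½) teeth" as (n + ½) tooth spacings (one tooth plus one gap), and the function name, tooth count, and rotation rate are hypothetical values chosen for the example, not Fizeau's recorded data. Only the 8 km distance comes from the text.

```python
def fizeau_speed_of_light(distance_m, teeth, rev_per_s, n=0):
    """Estimate c from a toothed-wheel measurement (illustrative sketch).

    Assumes the returning pulse is blocked when the wheel advances
    (n + 1/2) tooth spacings during the round trip of length 2 * distance_m.
    """
    # Time for the wheel to advance (n + 1/2) tooth spacings:
    # one full spacing passes in 1 / (teeth * rev_per_s) seconds.
    travel_time_s = (n + 0.5) / (teeth * rev_per_s)
    # The light covers the distance to the mirror and back in that time.
    return 2.0 * distance_m / travel_time_s


# 8 km is from the text; teeth and rev_per_s are hypothetical inputs.
c_estimate = fizeau_speed_of_light(distance_m=8.0e3, teeth=720, rev_per_s=12.6)
print(f"c is roughly {c_estimate:.3g} m/s")
```

With these assumed inputs the estimate lands near the accepted value, which shows how sensitive the method is to the measured rotation rate and distance.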


Figure 1.3 Fizeau’s method for measuring the speed of light. The teeth of the wheel block the reflected light upon return when the wheel is rotated at a rate that matches the light travel time to and from the mirror.


The French physicist Jean Bernard Léon Foucault (1819–1868) modified Fizeau’s apparatus by replacing the toothed wheel with a rotating mirror. In 1862, he measured the speed of light to be $2.98 \times 10^{8}$ m/s, which is within 0.6% of the presently accepted value. Albert Michelson (1852–1931) also used Foucault’s method on several occasions to measure the speed of light. His first experiments were performed in 1878; by 1926, he had refined the technique so well that he found c to be $(2.99796 \pm 0.00004) \times 10^{8}$ m/s.
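The percentages quoted for these early measurements can be checked against the modern value of c given in Equation 1.1. The following is a minimal sketch; the dictionary and the helper name `percent_deviation` are introduced here for illustration.

```python
C = 2.99792458e8  # m/s, modern value of c (Equation 1.1)

# Historical values as quoted in the text, in m/s.
measurements = {
    "Roemer (1675)": 2.0e8,
    "Fizeau (1849)": 3.15e8,
    "Foucault (1862)": 2.98e8,
}


def percent_deviation(value, reference=C):
    """Signed deviation of a measurement from the reference, in percent."""
    return 100.0 * (value - reference) / reference


for name, value in measurements.items():
    print(f"{name}: {percent_deviation(value):+.1f}%")
```

A negative sign means the measurement fell below the modern value; a positive sign means it was too high.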


Today, the speed of light is known to great precision. In fact, the speed of light in a vacuum c is so important that it is accepted as one of the basic physical quantities and has the value

$$c = 2.99792458 \times 10^{8}\ \text{m/s} \approx 3.00 \times 10^{8}\ \text{m/s} \tag{1.1}$$

where the approximate value of $3.00 \times 10^{8}$ m/s is used whenever three-digit accuracy is sufficient.
Speed of Light in Matter
The speed of light through matter is less than it is in a vacuum, because light interacts with atoms in a material. The speed of light depends strongly on the type of material, since its interaction varies with different atoms, crystal lattices, and other substructures. We can define a constant of a material that describes the speed of light in it, called the index of refraction n:

$$n = \frac{c}{v} \tag{1.2}$$


where v is the observed speed of light in the material.
Since the speed of light is always less than c in matter and equals c only in a vacuum, the index of refraction is always greater than or equal to one; that is, $n \geq 1$. Table 1.1 gives the indices of refraction for some representative substances. The values are listed for a particular wavelength of light, because they vary slightly with wavelength. (This can have important effects, such as colors separated by a prism, as we will see in Dispersion.) Note that for gases, n is close to 1.0. This seems reasonable, since atoms in gases are widely separated, and light travels at c in the vacuum between atoms. It is common to take n = 1 for gases unless great precision is needed. Although the speed of light v in a medium varies considerably from its value c in a vacuum, it is still a large speed.


Medium                      n

Gases at 0°C, 1 atm
  Air                       1.000293
  Carbon dioxide            1.00045
  Hydrogen                  1.000139
  Oxygen                    1.000271

Liquids at 20°C
  Benzene                   1.501
  Carbon disulfide          1.628
  Carbon tetrachloride      1.461
  Ethanol                   1.361
  Glycerine                 1.473
  Water, fresh              1.333

Solids at 20°C
  Diamond                   2.419
  Fluorite                  1.434
  Glass, crown              1.52
  Glass, flint              1.66
  Ice (at 0°C)              1.309
  Polystyrene               1.49
  Plexiglas                 1.51
  Quartz, crystalline       1.544
  Quartz, fused             1.458
  Sodium chloride           1.544
  Zircon                    1.923

Table 1.1 Index of Refraction in Various Media. For light with a wavelength of 589 nm in a vacuum.


Example 1.1
Speed of Light in Jewelry
Calculate the speed of light in zircon, a material used in jewelry to imitate diamond.
Strategy
We can calculate the speed of light in a material v from the index of refraction n of the material, using the equation $n = c/v$.




Solution
Rearranging the equation $n = c/v$ for v gives us

$$v = \frac{c}{n}.$$

The index of refraction for zircon is given as 1.923 in Table 1.1, and c is given in Equation 1.1. Entering these values in the equation gives

$$v = \frac{3.00 \times 10^{8}\ \text{m/s}}{1.923} = 1.56 \times 10^{8}\ \text{m/s}.$$


Significance
This speed is slightly larger than half the speed of light in a vacuum and is still high compared with speeds we normally experience. The only substance listed in Table 1.1 that has a greater index of refraction than zircon is diamond. We shall see later that the large index of refraction for zircon makes it sparkle more than glass, but less than diamond.
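The calculation in Example 1.1 generalizes to any medium in Table 1.1. The sketch below applies $v = c/n$ to a few table entries; the function name `speed_in_medium` and the choice of media are illustrative.

```python
C = 3.00e8  # m/s, three-digit value of c from Equation 1.1


def speed_in_medium(n, c=C):
    """Speed of light in a medium with index of refraction n, from v = c / n."""
    if n < 1.0:
        raise ValueError("the index of refraction must satisfy n >= 1")
    return c / n


# Indices of refraction taken from Table 1.1 (589 nm light).
indices = {"water": 1.333, "crown glass": 1.52, "zircon": 1.923, "diamond": 2.419}
for medium, n in indices.items():
    print(f"{medium}: v = {speed_in_medium(n):.3g} m/s")
```

The larger the index of refraction, the slower the light travels, which is why diamond gives the smallest speed in this list.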


Check Your Understanding 1.1 Table 1.1 shows that ethanol and fresh water have very similar indices of refraction. By what percentage do the speeds of light in these liquids differ?


The Ray Model of Light
You have already studied some of the wave characteristics of light in the previous chapter on Electromagnetic Waves (http://cnx.org/content/m58495/latest/). In this chapter, we start mainly with the ray characteristics. There are three ways in which light can travel from a source to another location (Figure 1.4). It can come directly from the source through empty space, such as from the Sun to Earth. Or light can travel through various media, such as air and glass, to the observer. Light can also arrive after being reflected, such as by a mirror. In all of these cases, we can model the path of light as a straight line called a ray.


Figure 1.4 Three methods for light to travel from a source to another location. (a) Light reaches the upper atmosphere of Earth, traveling through empty space directly from the source. (b) Light can reach a person by traveling through media like air and glass. (c) Light can also reflect from an object like a mirror. In the situations shown here, light interacts with objects large enough that it travels in straight lines, like a ray.


Experiments show that when light interacts with an object several times larger than its wavelength, it travels in straight lines and acts like a ray. Its wave characteristics are not pronounced in such situations. Since the wavelength of visible light is less than a micron (a thousandth of a millimeter), it acts like a ray in the many common situations in which it encounters objects larger than a micron. For example, when visible light encounters anything large enough that we can observe it with unaided eyes, such as a coin, it acts like a ray, with generally negligible wave characteristics.
In all of these cases, we can model the path of light as straight lines. Light may change direction when it encounters objects (such as a mirror) or in passing from one material to another (such as in passing from air to glass), but it then continues in a straight line or as a ray. The word “ray” comes from mathematics and here means a straight line that originates at some point. It is acceptable to visualize light rays as laser rays. The ray model of light describes the path of light as straight lines.
Since light moves in straight lines, changing directions when it interacts with materials, its path is described by geometry and simple trigonometry. This part of optics, where the ray aspect of light dominates, is therefore called geometric optics. Two laws govern how light changes direction when it interacts with matter. These are the law of reflection, for situations in which light bounces off matter, and the law of refraction, for situations in which light passes through matter. We will examine more about each of these laws in upcoming sections of this chapter.
1.2 | The Law of Reflection


Learning Objectives
By the end of this section, you will be able to:
• Explain the reflection of light from polished and rough surfaces
• Describe the principle and applications of corner reflectors


Whenever we look into a mirror, or squint at sunlight glinting from a lake, we are seeing a reflection. When you look at a piece of white paper, you are seeing light scattered from it. Large telescopes use reflection to form an image of stars and other astronomical objects.
The law of reflection states that the angle of reflection equals the angle of incidence, or


$$\theta_r = \theta_i \tag{1.3}$$


The law of reflection is illustrated in Figure 1.5, which also shows how the angle of incidence and angle of reflection are measured relative to the perpendicular to the surface at the point where the light ray strikes.


Figure 1.5 The law of reflection states that the angle of reflection equals the angle of incidence: $\theta_r = \theta_i$. The angles are measured relative to the perpendicular to the surface at the point where the ray strikes the surface.


We expect to see reflections from smooth surfaces, but Figure 1.6 illustrates how a rough surface reflects light. Since the light strikes different parts of the surface at different angles, it is reflected in many different directions, or diffused. Diffused light is what allows us to see a sheet of paper from any angle, as shown in Figure 1.7(a). People, clothing, leaves, and walls all have rough surfaces and can be seen from all sides. A mirror, on the other hand, has a smooth surface (compared with the wavelength of light) and reflects light at specific angles, as illustrated in Figure 1.7(b). When the Moon reflects from a lake, as shown in Figure 1.7(c), a combination of these effects takes place.






Figure 1.6 Light is diffused when it reflects from a rough surface. Here, many parallel rays are incident, but they are reflected at many different angles, because the surface is rough.


Figure 1.7 (a) When a sheet of paper is illuminated with many parallel incident rays, it can be seen at many different angles, because its surface is rough and diffuses the light. (b) A mirror illuminated by many parallel rays reflects them in only one direction, because its surface is very smooth. Only the observer at a particular angle sees the reflected light. (c) Moonlight is spread out when it is reflected by the lake, because the surface is shiny but uneven. (credit c: modification of work by Diego Torres Silvestre)


When you see yourself in a mirror, it appears that the image is actually behind the mirror (Figure 1.8). We see the light coming from a direction determined by the law of reflection. The angles are such that the image is exactly the same distance behind the mirror as you stand in front of the mirror. If the mirror is on the wall of a room, the images in it are all behind the mirror, which can make the room seem bigger. Although these mirror images make objects appear to be where they cannot be (like behind a solid wall), the images are not figments of your imagination. Mirror images can be photographed and videotaped by instruments and look just as they do with our eyes (which are optical instruments themselves). The precise manner in which images are formed by mirrors and lenses is discussed in an upcoming chapter on Geometric Optics and Image Formation.






Figure 1.8 (a) Your image in a mirror is behind the mirror. The two rays shown are those that strike the mirror at just the correct angles to be reflected into the eyes of the person. The image appears to be behind the mirror at the same distance away as (b) if you were looking at your twin directly, with no mirror.
Corner Reflectors (Retroreflectors)
A light ray that strikes an object consisting of two mutually perpendicular reflecting surfaces is reflected back exactly parallel to the direction from which it came (Figure 1.9). This is true whenever the reflecting surfaces are perpendicular, and it is independent of the angle of incidence. (For a proof, see the end of this section.) Such an object is called a corner reflector, since the light bounces from its inside corner. Corner reflectors are a subclass of retroreflectors, which all reflect rays back in the directions from which they came. Although the geometry of the proof is much more complex, corner reflectors can also be built with three mutually perpendicular reflecting surfaces and are useful in three-dimensional applications.
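The claim that two perpendicular mirrors send a ray back antiparallel to its incoming direction can be checked numerically. The sketch below is a numerical illustration rather than a proof. It assumes the standard vector form of the law of reflection, r = d − 2(d·n)n for a unit normal n, which is not stated in the text, and it places the two mirrors along the coordinate axes; the function names are hypothetical.

```python
def reflect(direction, normal):
    """Reflect a 2D direction vector off a flat mirror with unit normal `normal`.

    Uses r = d - 2 (d . n) n, a vector statement of the law of reflection.
    """
    dot = direction[0] * normal[0] + direction[1] * normal[1]
    return (direction[0] - 2.0 * dot * normal[0],
            direction[1] - 2.0 * dot * normal[1])


def corner_reflect(direction):
    """Reflect off two mutually perpendicular mirrors in succession."""
    after_first = reflect(direction, (1.0, 0.0))  # mirror lying along the y-axis
    return reflect(after_first, (0.0, 1.0))       # mirror lying along the x-axis


incoming = (-0.6, -0.8)  # ray heading into the corner
outgoing = corner_reflect(incoming)
print("incoming:", incoming)
print("outgoing:", outgoing)
```

Each reflection flips one component of the direction, so after both reflections the outgoing direction is the negative of the incoming one, whatever the angle of incidence.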


Figure 1.9 A light ray that strikes two mutually perpendicular reflecting surfaces is reflected back exactly parallel to the direction from which it came.


Many inexpensive reflector buttons on bicycles, cars, and warning signs have corner reflectors designed to return light in the direction from which it originated. Rather than simply reflecting light over a wide angle, retroreflection ensures high visibility if the observer and the light source are located together, such as a car’s driver and headlights. The Apollo astronauts placed a true corner reflector on the Moon (Figure 1.10). Laser signals from Earth can be bounced from that corner reflector to measure the gradually increasing distance to the Moon of a few centimeters per year.






Figure 1.10 (a) Astronauts placed a corner reflector on the Moon to measure its gradually increasing orbital distance. (b) The bright spots on these bicycle safety reflectors are reflections of the flash of the camera that took this picture on a dark night. (credit a: modification of work by NASA; credit b: modification of work by “Julo”/Wikimedia Commons)


Working on the same principle as these optical reflectors, corner reflectors are routinely used as radar reflectors (Figure 1.11) for radio-frequency applications. Under most circumstances, small boats made of fiberglass or wood do not strongly reflect radio waves emitted by radar systems. To make these boats visible to radar (to avoid collisions, for example), radar reflectors are attached to boats, usually in high places.


Figure 1.11 A radar reflector hoisted on a sailboat is a type of corner reflector. (credit: Tim Sheerman-Chase)


As a counterexample, if you are interested in building a stealth airplane, radar reflections should be minimized to evade detection. One of the design considerations would then be to avoid building 90° corners into the airframe.
1.3 | Refraction


Learning Objectives
By the end of this section, you will be able to:
• Describe how rays change direction upon entering a medium
• Apply the law of refraction in problem solving


You may often notice some odd things when looking into a fish tank. For example, you may see the same fish appearing to be in two different places (Figure 1.12). This happens because light coming from the fish to you changes direction when it leaves the tank, and in this case, it can travel two different paths to get to your eyes. The changing of a light ray’s direction (loosely called bending) when it passes through substances of different refractive indices is called refraction and is related to changes in the speed of light, $v = c/n$. Refraction is responsible for a tremendous range of optical phenomena, from the action of lenses to data transmission through optical fibers.


Figure 1.12 (a) Looking at the fish tank as shown, we can see the same fish in two different locations, because light changes directions when it passes from water to air. In this case, the light can reach the observer by two different paths, so the fish seems to be in two different places. This bending of light is called refraction and is responsible for many optical phenomena. (b) This image shows refraction of light from a fish near the top of a fish tank.


Figure 1.13 shows how a ray of light changes direction when it passes from one medium to another. As before, the angles are measured relative to a perpendicular to the surface at the point where the light ray crosses it. (Some of the incident light is reflected from the surface, but for now we concentrate on the light that is transmitted.) The change in direction of the light ray depends on the relative values of the indices of refraction (The Propagation of Light) of the two media involved. In the situations shown, medium 2 has a greater index of refraction than medium 1. Note that as shown in Figure 1.13(a), the direction of the ray moves closer to the perpendicular when it progresses from a medium with a lower index of refraction to one with a higher index of refraction. Conversely, as shown in Figure 1.13(b), the direction of the ray moves away from the perpendicular when it progresses from a medium with a higher index of refraction to one with a lower index of refraction. The path is exactly reversible.






Figure 1.13 The change in direction of a light ray depends on how the index of refraction changes when it crosses from one medium to another. In the situations shown here, the index of refraction is greater in medium 2 than in medium 1. (a) A ray of light moves closer to the perpendicular when entering a medium with a higher index of refraction. (b) A ray of light moves away from the perpendicular when entering a medium with a lower index of refraction.


The amount that a light ray changes its direction depends both on the incident angle and the amount that the speed changes. For a ray at a given incident angle, a large change in speed causes a large change in direction and thus a large change in angle. The exact mathematical relationship is the law of refraction, or Snell’s law, after the Dutch mathematician Willebrord Snell (1591–1626), who discovered it in 1621. The law of refraction is stated in equation form as


$$n_1 \sin\theta_1 = n_2 \sin\theta_2. \tag{1.4}$$


Here $n_1$ and $n_2$ are the indices of refraction for media 1 and 2, and $\theta_1$ and $\theta_2$ are the angles between the rays and the perpendicular in media 1 and 2. The incoming ray is called the incident ray, the outgoing ray is called the refracted ray, and the associated angles are the incident angle and the refracted angle, respectively.
Snell’s experiments showed that the law of refraction is obeyed and that a characteristic index of refraction n could be assigned to a given medium and its value measured. Snell was not aware that the speed of light varied in different media, a key fact used when we derive the law of refraction theoretically using Huygens’s principle in Huygens’s Principle.
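The link between Snell’s law and the speed of light can be made explicit by substituting the definition of the index of refraction, $n = c/v$ (Equation 1.2), into Equation 1.4. In the short rearrangement below, the symbols $v_1$ and $v_2$ are introduced here to denote the speeds of light in media 1 and 2; this is only a substitution, not the Huygens derivation mentioned above.

```latex
n_1 \sin\theta_1 = n_2 \sin\theta_2
\quad\text{with}\quad n_1 = \frac{c}{v_1},\; n_2 = \frac{c}{v_2}
\quad\Longrightarrow\quad
\frac{\sin\theta_1}{\sin\theta_2} = \frac{n_2}{n_1} = \frac{v_1}{v_2}.
```

In words, the ray makes the smaller angle with the perpendicular in the medium where light travels more slowly.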
Example 1.2


Determining the Index of Refraction
Find the index of refraction for medium 2 in Figure 1.13(a), assuming medium 1 is air and given that the incident angle is 30.0° and the angle of refraction is 22.0°.
Strategy
The index of refraction for air is taken to be 1 in most cases (and up to four significant figures, it is 1.000). Thus, $n_1 = 1.00$ here. From the given information, $\theta_1 = 30.0^\circ$ and $\theta_2 = 22.0^\circ$. With this information, the only unknown in Snell’s law is $n_2$, so we can use Snell’s law to find it.
Solution
From Snell’s law we have

$$n_1 \sin\theta_1 = n_2 \sin\theta_2$$

$$n_2 = n_1 \frac{\sin\theta_1}{\sin\theta_2}.$$

Entering known values,

$$n_2 = 1.00\,\frac{\sin 30.0^\circ}{\sin 22.0^\circ} = \frac{0.500}{0.375} = 1.33.$$


Significance
This is the index of refraction for water, and Snell could have determined it by measuring the angles and performing this calculation. He would then have found 1.33 to be the appropriate index of refraction for water in all other situations, such as when a ray passes from water to glass. Today, we can verify that the index of refraction is related to the speed of light in a medium by measuring that speed directly.
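The arithmetic of Example 1.2 can be reproduced in a few lines. This is a minimal sketch of Snell’s law solved for $n_2$; the function name `index_from_angles` is introduced here for illustration.

```python
import math


def index_from_angles(n1, theta1_deg, theta2_deg):
    """Solve n1 sin(theta1) = n2 sin(theta2) for n2, with angles in degrees."""
    return n1 * math.sin(math.radians(theta1_deg)) / math.sin(math.radians(theta2_deg))


# Values from Example 1.2: air (n1 = 1.00), 30.0 deg incident, 22.0 deg refracted.
n2 = index_from_angles(1.00, 30.0, 22.0)
print(f"n2 = {n2:.2f}")
```

Note that Python’s `math.sin` expects radians, so the angles are converted with `math.radians` first.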
Explore bending of light (https://openstaxcollege.org/l/21bendoflight) between two media with different indices of refraction. Use the “Intro” simulation and see how changing from air to water to glass changes the bending angle. Use the protractor tool to measure the angles and see if you can recreate the configuration in Example 1.2. Also by measurement, confirm that the angle of reflection equals the angle of incidence.


Example 1.3
A Larger Change in Direction
Suppose that in a situation like that in Example 1.2, light goes from air to diamond and that the incident angle is 30.0°. Calculate the angle of refraction $\theta_2$ in the diamond.
Strategy
Again, the index of refraction for air is taken to be $n_1 = 1.00$, and we are given $\theta_1 = 30.0^\circ$. We can look up the index of refraction for diamond in Table 1.1, finding $n_2 = 2.419$. The only unknown in Snell’s law is $\theta_2$, which we wish to determine.
Solution
Solving Snell’s law for $\sin\theta_2$ yields

$$\sin\theta_2 = \frac{n_1}{n_2} \sin\theta_1.$$

Entering known values,

$$\sin\theta_2 = \frac{1.00}{2.419} \sin 30.0^\circ = (0.413)(0.500) = 0.207.$$

The angle is thus

$$\theta_2 = \sin^{-1}(0.207) = 11.9^\circ.$$


Significance
For the same 30.0° angle of incidence, the angle of refraction in diamond is significantly smaller than in water (11.9° rather than 22.0°; see Example 1.2). This means there is a larger change in direction in diamond. The cause of a large change in direction is a large change in the index of refraction (or speed). In general, the larger the change in speed, the greater the effect on the direction of the ray.
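The comparison between water and diamond can be reproduced by solving Snell’s law for the refracted angle. The sketch below is illustrative; the function name `refraction_angle_deg` is an assumption, and the indices come from Table 1.1.

```python
import math


def refraction_angle_deg(n1, n2, theta1_deg):
    """Angle of refraction in degrees from n1 sin(theta1) = n2 sin(theta2)."""
    sin_theta2 = n1 * math.sin(math.radians(theta1_deg)) / n2
    if sin_theta2 > 1.0:
        # No real angle satisfies Snell's law for these inputs.
        raise ValueError("sin(theta2) would exceed 1; there is no refracted ray")
    return math.degrees(math.asin(sin_theta2))


# Light entering from air (n1 = 1.00) at 30.0 degrees.
for medium, n2 in {"water": 1.333, "diamond": 2.419}.items():
    theta2 = refraction_angle_deg(1.00, n2, 30.0)
    print(f"{medium}: theta2 = {theta2:.1f} deg, change in direction = {30.0 - theta2:.1f} deg")
```

The change in direction is the difference between the incident and refracted angles, so the smaller refracted angle in diamond corresponds to the larger bend.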


Check Your Understanding 1.2 In Table 1.1, the solid with the next highest index of refraction after diamond is zircon. If the diamond in Example 1.3 were replaced with a piece of zircon, what would be the new angle of refraction?






1.4 | Total Internal Reflection
Learning Objectives


By the end of this section, you will be able to:
• Explain the phenomenon of total internal reflection
• Describe the workings and uses of optical fibers
• Analyze the reason for the sparkle of diamonds


A good-quality mirror may reflect more than 90% of the light that falls on it, absorbing the rest. But it would be useful to have a mirror that reflects all of the light that falls on it. Interestingly, we can produce total reflection using an aspect of refraction.
Consider what happens when a ray of light strikes the surface between two materials, as shown in Figure 1.14(a). Part of the light crosses the boundary and is refracted; the rest is reflected. If, as shown in the figure, the index of refraction for the second medium is less than for the first, the ray bends away from the perpendicular. (Since $n_1 > n_2$, the angle of refraction is greater than the angle of incidence; that is, $\theta_2 > \theta_1$.) Now imagine what happens as the incident angle increases. This causes $\theta_2$ to increase also. The largest the angle of refraction $\theta_2$ can be is 90°, as shown in part (b). The critical angle $\theta_c$ for a combination of materials is defined to be the incident angle $\theta_1$ that produces an angle of refraction of 90°. That is, $\theta_c$ is the incident angle for which $\theta_2 = 90^\circ$. If the incident angle $\theta_1$ is greater than the critical angle, as shown in Figure 1.14(c), then all of the light is reflected back into medium 1, a condition called total internal reflection. (As the figure shows, the reflected rays obey the law of reflection so that the angle of reflection is equal to the angle of incidence in all three cases.)


Figure 1.14 (a) A ray of light crosses a boundary where the index of refraction decreases. That is, $n_2 < n_1$. The ray bends away from the perpendicular. (b) The critical angle $\theta_c$ is the angle of incidence for which the angle of refraction is 90°. (c) Total internal reflection occurs when the incident angle is greater than the critical angle.


Snell’s law states the relationship between angles and indices of refraction. It is given by

$$n_1 \sin\theta_1 = n_2 \sin\theta_2.$$

When the incident angle equals the critical angle ($\theta_1 = \theta_c$), the angle of refraction is 90° ($\theta_2 = 90^\circ$). Noting that $\sin 90^\circ = 1$, Snell’s law in this case becomes

$$n_1 \sin\theta_1 = n_2.$$


The critical angle $\theta_c$ for a given combination of materials is thus

$$\theta_c = \sin^{-1}\left(\frac{n_2}{n_1}\right) \quad \text{for } n_1 > n_2. \tag{1.5}$$


Total internal reflection occurs for any incident angle greater than the critical angle $\theta_c$, and it can only occur when the second medium has an index of refraction less than the first. Note that this equation is written for a light ray that travels in medium 1 and reflects from medium 2, as shown in Figure 1.14.
Example 1.4


Determining a Critical Angle
What is the critical angle for light traveling in a polystyrene (a type of plastic) pipe surrounded by air? The index of refraction for polystyrene is 1.49.
Strategy
The index of refraction of air can be taken to be 1.00, as before. Thus, the condition that the second medium (air) has an index of refraction less than the first (plastic) is satisfied, and we can use the equation

$$\theta_c = \sin^{-1}\left(\frac{n_2}{n_1}\right)$$

to find the critical angle $\theta_c$, where $n_2 = 1.00$ and $n_1 = 1.49$.
Solution
Substituting the identified values gives

$$\theta_c = \sin^{-1}\left(\frac{1.00}{1.49}\right) = \sin^{-1}(0.671) = 42.2^\circ.$$


Significance
This result means that any ray of light inside the plastic that strikes the surface at an angle greater than 42.2° is totally reflected. This makes the inside surface of the clear plastic a perfect mirror for such rays, without any need for the silvering used on common mirrors. Different combinations of materials have different critical angles, but any combination with $n_1 > n_2$ can produce total internal reflection. The same calculation as made here shows that the critical angle for a ray going from water to air is 48.6°, whereas that from diamond to air is 24.4°, and that from flint glass to crown glass is 66.3°.
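The critical angles quoted above all follow from Equation 1.5. The sketch below evaluates it for the four material pairs mentioned, using indices from Table 1.1; the function name `critical_angle_deg` and the dictionary layout are illustrative choices.

```python
import math


def critical_angle_deg(n1, n2):
    """Critical angle in degrees for light in medium 1 meeting medium 2 (Equation 1.5)."""
    if n1 <= n2:
        # Equation 1.5 applies only when the second medium has the smaller index.
        raise ValueError("total internal reflection requires n1 > n2")
    return math.degrees(math.asin(n2 / n1))


# (n1, n2) pairs: the ray travels in medium 1 and reflects from medium 2.
pairs = {
    "polystyrene to air": (1.49, 1.00),
    "water to air": (1.333, 1.00),
    "diamond to air": (2.419, 1.00),
    "flint glass to crown glass": (1.66, 1.52),
}
for name, (n1, n2) in pairs.items():
    print(f"{name}: critical angle = {critical_angle_deg(n1, n2):.1f} deg")
```

The larger the ratio $n_1/n_2$, the smaller the critical angle, so a wider range of incident angles is totally reflected.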


Check Your Understanding 1.3 At the surface between air and water, light rays can go from air to water and from water to air. For which ray is there no possibility of total internal reflection?


In the photo that opens this chapter, the image of a swimmer underwater is captured by a camera that is also underwater. The swimmer in the upper half of the photograph, apparently facing upward, is, in fact, a reflected image of the swimmer below. The circular ripple near the photograph’s center is actually on the water surface. The undisturbed water surrounding it makes a good reflecting surface when viewed from below, thanks to total internal reflection. However, at the very top edge of this photograph, rays from below strike the surface with incident angles less than the critical angle, allowing the camera to capture a view of activities on the pool deck above water.
Fiber Optics: Endoscopes to Telephones
Fiber optics is one application of total internal reflection that is in wide use. In communications, it is used to transmit telephone, internet, and cable TV signals. Fiber optics employs the transmission of light down fibers of plastic or glass. Because the fibers are thin, light entering one is likely to strike the inside surface at an angle greater than the critical angle and, thus, be totally reflected (Figure 1.15). The index of refraction outside the fiber must be smaller than inside. In fact, most fibers have a varying refractive index to allow more light to be guided along the fiber through total internal reflection. Rays are reflected around corners as shown, making the fibers into tiny light pipes.






Figure 1.15 Light entering a thin optic fiber may strike the inside surface at large or grazing angles and is completely reflected if these angles exceed the critical angle. Such rays continue down the fiber, even following it around corners, since the angles of reflection and incidence remain large.


Bundles of fibers can be used to transmit an image without a lens, as illustrated in Figure 1.16. The output of a device called an endoscope is shown in Figure 1.16(b). Endoscopes are used to explore the interior of the body through its natural orifices or minor incisions. Light is transmitted down one fiber bundle to illuminate internal parts, and the reflected light is transmitted back out through another bundle to be observed.


Figure 1.16 (a) An image “A” is transmitted by a bundle of optical fibers. (b) An endoscope is used to probe the body, both transmitting light to the interior and returning an image such as the one shown of a human epiglottis (a structure at the base of the tongue). (credit b: modification of work by “Med_Chaos”/Wikimedia Commons)


Fiber optics has revolutionized surgical techniques and observations within the body, with a host of medical diagnostic and therapeutic uses. Surgery can be performed, such as arthroscopic surgery on a knee or shoulder joint, employing cutting tools attached to and observed with the endoscope. Samples can also be obtained, such as by lassoing an intestinal polyp for external examination. The flexibility of the fiber optic bundle allows doctors to navigate it around small and difficult-to-reach regions in the body, such as the intestines, the heart, blood vessels, and joints. Transmission of an intense laser beam to burn away obstructing plaques in major arteries, as well as delivering light to activate chemotherapy drugs, are becoming commonplace. Optical fibers have in fact enabled microsurgery and remote surgery where the incisions are small and the surgeon’s fingers do not need to touch the diseased tissue.
Optical fibers in bundles are surrounded by a cladding material that has a lower index of refraction than the core (Figure 1.17). The cladding prevents light from being transmitted between fibers in a bundle. Without cladding, light could pass between fibers in contact, since their indices of refraction are identical. Since no light gets into the cladding (there is total internal reflection back into the core), none can be transmitted between clad fibers that are in contact with one another. Instead, the light is propagated along the length of the fiber, minimizing the loss of signal and ensuring that a quality image is formed at the other end. The cladding and an additional protective layer make optical fibers durable as well as flexible.


Figure 1.17 Fibers in bundles are clad by a material that has a lower index of refraction than the core to ensure total internal reflection, even when fibers are in contact with one another.


Special tiny lenses that can be attached to the ends of bundles of fibers have been designed and fabricated. Light emerging from a fiber bundle can be focused through such a lens, imaging a tiny spot. In some cases, the spot can be scanned, allowing quality imaging of a region inside the body. Special minute optical filters inserted at the end of the fiber bundle have the capacity to image the interior of organs located tens of microns below the surface without cutting the surface, an area known as nonintrusive diagnostics. This is particularly useful for determining the extent of cancers in the stomach and bowel.
In another type of application, optical fibers are commonly used to carry signals for telephone conversations and internet communications. Extensive optical fiber cables have been placed on the ocean floor and underground to enable optical communications. Optical fiber communication systems offer several advantages over electrical (copper)-based systems, particularly for long distances. The fibers can be made so transparent that light can travel many kilometers before it becomes dim enough to require amplification, much superior to copper conductors. This property of optical fibers is called low loss. Lasers emit light with characteristics that allow far more conversations in one fiber than are possible with electric signals on a single conductor. This property of optical fibers is called high bandwidth. Optical signals in one fiber do not produce undesirable effects in other adjacent fibers. This property of optical fibers is called reduced crosstalk. We shall explore the unique characteristics of laser radiation in a later chapter.
Corner Reflectors and Diamonds
Corner reflectors (The Law of Reflection) are perfectly efficient when the conditions for total internal reflection are satisfied. With common materials, it is easy to obtain a critical angle that is less than 45°. One use of these perfect mirrors is in binoculars, as shown in Figure 1.18. Another use is in periscopes found in submarines.






Figure 1.18 These binoculars employ corner reflectors (prisms) with total internal reflection to get light to the observer’s eyes.


Total internal reflection, coupled with a large index of refraction, explains why diamonds sparkle more than other materials. The critical angle for a diamond-to-air surface is only 24.4°, so when light enters a diamond, it has trouble getting back out (Figure 1.19). Although light freely enters the diamond, it can exit only if it makes an angle less than 24.4°. Facets on diamonds are specifically intended to make this unlikely. Good diamonds are very clear, so that the light makes many internal reflections and is concentrated before exiting, hence the bright sparkle. (Zircon is a natural gemstone that has an exceptionally large index of refraction, but it is not as large as diamond, so it is not as highly prized. Cubic zirconia is manufactured and has an even higher index of refraction (≈2.17), but it is still less than that of diamond.) The colors you see emerging from a clear diamond are not due to the diamond’s color, which is usually nearly colorless. The colors result from dispersion, which we discuss in Dispersion. Colored diamonds get their color from structural defects of the crystal lattice and the inclusion of minute quantities of graphite and other materials. The Argyle Mine in Western Australia produces around 90% of the world’s pink, red, champagne, and cognac diamonds, whereas around 50% of the world’s clear diamonds come from central and southern Africa.
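As a rough numerical check of the critical angles discussed in this subsection, the Python sketch below evaluates the critical-angle relation sin θc = n2/n1 for a gem-to-air surface. The function name `critical_angle_deg` is an illustrative choice, and the inputs are assumptions made for this example: the diamond index is taken as 2.417 (the yellow-light entry of Table 1.2), air is taken as 1.000, and 2.17 is the approximate cubic zirconia value quoted in the text.

```python
import math


def critical_angle_deg(n_inside: float, n_outside: float = 1.000) -> float:
    """Critical angle (degrees) for light traveling from n_inside toward n_outside.

    Total internal reflection is only possible when n_inside > n_outside.
    """
    if n_inside <= n_outside:
        raise ValueError("no critical angle unless n_inside > n_outside")
    return math.degrees(math.asin(n_outside / n_inside))


# Assumed indices: diamond for yellow light (Table 1.2), cubic zirconia (~2.17).
for name, n in [("diamond", 2.417), ("cubic zirconia", 2.17)]:
    print(f"{name}: critical angle = {critical_angle_deg(n):.1f} degrees")
```

A smaller critical angle means a wider range of internal incidence angles is totally reflected, which is consistent with diamond trapping light more effectively than cubic zirconia.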


Figure 1.19 Light cannot easily escape a diamond, because its critical angle with air is so small. Most reflections are total, and the facets are placed so that light can exit only in particular ways, thus concentrating the light and making the diamond sparkle brightly.






Explore refraction and reflection of light (https://openstaxcollege.org/l/21bendoflight) between two media with different indices of refraction. Try to make the refracted ray disappear with total internal reflection. Use the protractor tool to measure the critical angle and compare with the prediction from Equation 1.5.


1.5 | Dispersion
Learning Objectives


By the end of this section, you will be able to:
• Explain the cause of dispersion in a prism
• Describe the effects of dispersion in producing rainbows
• Summarize the advantages and disadvantages of dispersion


Everyone enjoys the spectacle of a rainbow glimmering against a dark stormy sky. How does sunlight falling on clear drops of rain get broken into the rainbow of colors we see? The same process causes white light to be broken into colors by a clear glass prism or a diamond (Figure 1.20).


Figure 1.20 The colors of the rainbow (a) and those produced by a prism (b) are identical. (credit a: modification of work by “Alfredo55”/Wikimedia Commons; credit b: modification of work by NASA)


We see about six colors in a rainbow: red, orange, yellow, green, blue, and violet; sometimes indigo is listed, too. These colors are associated with different wavelengths of light, as shown in Figure 1.21. When our eye receives pure-wavelength light, we tend to see only one of the six colors, depending on wavelength. The thousands of other hues we can sense in other situations are our eye’s response to various mixtures of wavelengths. White light, in particular, is a fairly uniform mixture of all visible wavelengths. Sunlight, considered to be white, actually appears to be a bit yellow, because of its mixture of wavelengths, but it does contain all visible wavelengths. The sequence of colors in rainbows is the same sequence as the colors shown in the figure. This implies that white light is spread out in a rainbow according to wavelength. Dispersion is defined as the spreading of white light into its full spectrum of wavelengths. More technically, dispersion occurs whenever the propagation of light depends on wavelength.


Figure 1.21 Even though rainbows are associated with six colors, the rainbow is a continuous distribution of colors according to wavelengths.


Any type of wave can exhibit dispersion. For example, sound waves, all types of electromagnetic waves, and water waves can be dispersed according to wavelength. Dispersion may require special circumstances and can result in spectacular displays such as in the production of a rainbow. This is also true for sound, since all frequencies ordinarily travel at the same speed. If you listen to sound through a long tube, such as a vacuum cleaner hose, you can easily hear it dispersed by interaction with the tube. Dispersion, in fact, can reveal a great deal about what the wave has encountered that disperses its wavelengths. The dispersion of electromagnetic radiation from outer space, for example, has revealed much about what exists between the stars, the so-called interstellar medium.
Nick Moore’s video (https://openstaxcollege.org/l/21nickmoorevid) discusses dispersion of a pulse as he taps a long spring. Follow his explanation as Moore replays the high-speed footage showing high frequency waves outrunning the lower frequency waves.


Refraction is responsible for dispersion in rainbows and many other situations. The angle of refraction depends on the index of refraction, as we know from Snell’s law. We know that the index of refraction n depends on the medium. But for a given medium, n also depends on wavelength (Table 1.2). Note that for a given medium, n increases as wavelength decreases and is greatest for violet light. Thus, violet light is bent more than red light, as shown for a prism in Figure 1.22(b). White light is dispersed into the same sequence of wavelengths as seen in Figure 1.20 and Figure 1.21.
Medium          Red        Orange     Yellow     Green      Blue       Violet
                (660 nm)   (610 nm)   (580 nm)   (550 nm)   (470 nm)   (410 nm)
Water           1.331      1.332      1.333      1.335      1.338      1.342
Diamond         2.410      2.415      2.417      2.426      2.444      2.458
Glass, crown    1.512      1.514      1.518      1.519      1.524      1.530
Glass, flint    1.662      1.665      1.667      1.674      1.684      1.698
Polystyrene     1.488      1.490      1.492      1.493      1.499      1.506
Quartz, fused   1.455      1.456      1.458      1.459      1.462      1.468


Table 1.2 Index of Refraction n in Selected Media at Various Wavelengths
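The statement that n increases as wavelength decreases can be checked directly against Table 1.2. The sketch below transcribes the table into a Python dictionary and tests each row; the names `INDEX_TABLE`, `increases_toward_violet`, and `index_spread` are hypothetical helpers introduced only for this illustration.

```python
# Table 1.2 transcribed: index of refraction at 660, 610, 580, 550, 470, 410 nm.
WAVELENGTHS_NM = [660, 610, 580, 550, 470, 410]
INDEX_TABLE = {
    "Water":         [1.331, 1.332, 1.333, 1.335, 1.338, 1.342],
    "Diamond":       [2.410, 2.415, 2.417, 2.426, 2.444, 2.458],
    "Glass, crown":  [1.512, 1.514, 1.518, 1.519, 1.524, 1.530],
    "Glass, flint":  [1.662, 1.665, 1.667, 1.674, 1.684, 1.698],
    "Polystyrene":   [1.488, 1.490, 1.492, 1.493, 1.499, 1.506],
    "Quartz, fused": [1.455, 1.456, 1.458, 1.459, 1.462, 1.468],
}


def increases_toward_violet(medium: str) -> bool:
    """True if n rises at every step from red (660 nm) to violet (410 nm)."""
    values = INDEX_TABLE[medium]
    return all(a < b for a, b in zip(values, values[1:]))


def index_spread(medium: str) -> float:
    """Difference between the violet and red indices, a rough measure of dispersion."""
    values = INDEX_TABLE[medium]
    return values[-1] - values[0]


for medium in INDEX_TABLE:
    print(f"{medium:14s} n rises toward violet: {increases_toward_violet(medium)}, "
          f"n_violet - n_red = {index_spread(medium):.3f}")
```

The spread between the violet and red indices gives a simple way to compare how strongly each medium in the table disperses light.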


Figure 1.22 (a) A pure wavelength of light falls onto a prism and is refracted at both surfaces. (b) White light is dispersed by the prism (shown exaggerated). Since the index of refraction varies with wavelength, the angles of refraction vary with wavelength. A sequence of red to violet is produced, because the index of refraction increases steadily with decreasing wavelength.


Example 1.5
Dispersion of White Light by Flint Glass
A beam of white light goes from air into flint glass at an incidence angle of 43.2°. What is the angle between the red (660 nm) and violet (410 nm) parts of the refracted light?




Strategy
Values for the indices of refraction for flint glass at various wavelengths are listed in Table 1.2. Use these values to calculate the angle of refraction for each color and then take the difference to find the dispersion angle.
Solution
Applying the law of refraction for the red part of the beam,

$$ n_{\text{air}} \sin\theta_{\text{air}} = n_{\text{red}} \sin\theta_{\text{red}}, $$

we can solve for the angle of refraction as

$$ \theta_{\text{red}} = \sin^{-1}\left(\frac{n_{\text{air}} \sin\theta_{\text{air}}}{n_{\text{red}}}\right) = \sin^{-1}\left[\frac{(1.000)\sin 43.2^\circ}{1.662}\right] = 24.3^\circ. $$

Similarly, the angle of refraction for the violet part of the beam is

$$ \theta_{\text{violet}} = \sin^{-1}\left(\frac{n_{\text{air}} \sin\theta_{\text{air}}}{n_{\text{violet}}}\right) = \sin^{-1}\left[\frac{(1.000)\sin 43.2^\circ}{1.698}\right] = 23.8^\circ. $$

The difference between these two angles is

$$ \theta_{\text{red}} - \theta_{\text{violet}} = 24.3^\circ - 23.8^\circ = 0.5^\circ. $$

Significance
Although 0.5° may seem like a negligibly small angle, if this beam is allowed to propagate a long enough distance, the dispersion of colors becomes quite noticeable.
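The arithmetic in this example is easy to reproduce numerically. The sketch below is one way to do so, assuming the flint-glass indices from Table 1.2 (1.662 at 660 nm and 1.698 at 410 nm) and n = 1.000 for air; the function name `refraction_angle_deg` and the constant names are illustrative, not part of the text.

```python
import math


def refraction_angle_deg(n_in: float, n_out: float, incidence_deg: float) -> float:
    """Angle of refraction (degrees) from n_in * sin(theta_in) = n_out * sin(theta_out)."""
    sine = n_in * math.sin(math.radians(incidence_deg)) / n_out
    return math.degrees(math.asin(sine))


N_AIR = 1.000
N_FLINT_RED = 1.662     # 660 nm, Table 1.2
N_FLINT_VIOLET = 1.698  # 410 nm, Table 1.2
INCIDENCE_DEG = 43.2    # incidence angle given in the example

theta_red = refraction_angle_deg(N_AIR, N_FLINT_RED, INCIDENCE_DEG)
theta_violet = refraction_angle_deg(N_AIR, N_FLINT_VIOLET, INCIDENCE_DEG)

print(f"theta_red    = {theta_red:.1f} degrees")
print(f"theta_violet = {theta_violet:.1f} degrees")
print(f"difference   = {theta_red - theta_violet:.2f} degrees")
```

Because violet light sees the larger index, its refraction angle is the smaller of the two, so the difference is taken as red minus violet.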


Check Your Understanding 1.4 In the preceding example, how much distance inside the block of flint glass would the red and the violet rays have to progress before they are separated by 1.0 mm?


Rainbows are produced by a combination of refraction and reflection. You may have noticed that you see a rainbow only when you look away from the Sun. Light enters a drop of water and is reflected from the back of the drop (Figure 1.23). The light is refracted both as it enters and as it leaves the drop. Since the index of refraction of water varies with wavelength, the light is dispersed, and a rainbow is observed (Figure 1.24(a)). (No dispersion occurs at the back surface, because the law of reflection does not depend on wavelength.) The actual rainbow of colors seen by an observer depends on the myriad rays being refracted and reflected toward the observer’s eyes from numerous drops of water. The effect is most spectacular when the background is dark, as in stormy weather, but can also be observed in waterfalls and lawn sprinklers. The arc of a rainbow comes from the need to be looking at a specific angle relative to the direction of the Sun, as illustrated in Figure 1.24(b). If two reflections of light occur within the water drop, another “secondary” rainbow is produced. This rare event produces an arc that lies above the primary rainbow arc, as in Figure 1.24(c), and produces colors in the reverse order of the primary rainbow, with red at the lowest angle and violet at the largest angle.


Figure 1.23 A ray of light falling on this water drop enters and is reflected from the back of the drop. This light is refracted and dispersed both as it enters and as it leaves the drop.


Figure 1.24 (a) Different colors emerge in different directions, and so you must look at different locations to see the various colors of a rainbow. (b) The arc of a rainbow results from the fact that a line between the observer and any point on the arc must make the correct angle with the parallel rays of sunlight for the observer to receive the refracted rays. (c) Double rainbow. (credit c: modification of work by “Nicholas”/Wikimedia Commons)


Dispersion may produce beautiful rainbows, but it can cause problems in optical systems. White light used to transmit messages in a fiber is dispersed, spreading out in time and eventually overlapping with other messages. Since a laser produces a nearly pure wavelength, its light experiences little dispersion, an advantage over white light for transmission of information. In contrast, dispersion of electromagnetic waves coming to us from outer space can be used to determine the amount of matter they pass through.






1.6 | Huygens’s Principle
Learning Objectives


By the end of this section, you will be able to:
• Describe Huygens’s principle
• Use Huygens’s principle to explain the law of reflection
• Use Huygens’s principle to explain the law of refraction
• Use Huygens’s principle to explain diffraction


So far in this chapter, we have been discussing optical phenomena using the ray model of light. However, some phenomena require analysis and explanations based on the wave characteristics of light. This is particularly true when the wavelength is not negligible compared to the dimensions of an optical device, such as a slit in the case of diffraction. Huygens’s principle is an indispensable tool for this analysis.
Figure 1.25 shows how a transverse wave looks as viewed from above and from the side. A light wave can be imagined to propagate like this, although we do not actually see it wiggling through space. From above, we view the wave fronts (or wave crests) as if we were looking down on ocean waves. The side view would be a graph of the electric or magnetic field. The view from above is perhaps more useful in developing concepts about wave optics.


Figure 1.25 A transverse wave, such as an electromagnetic light wave, as viewed from above and from the side. The direction of propagation is perpendicular to the wave fronts (or wave crests) and is represented by a ray.


The Dutch scientist Christiaan Huygens (1629–1695) developed a useful technique for determining in detail how and where waves propagate. Starting from some known position, Huygens’s principle states that every point on a wave front is a source of wavelets that spread out in the forward direction at the same speed as the wave itself. The new wave front is tangent to all of the wavelets.
Figure 1.26 shows how Huygens’s principle is applied. A wave front is the long edge that moves, for example, with the crest or the trough. Each point on the wave front emits a semicircular wave that moves at the propagation speed v. We can draw these wavelets at a time t later, so that they have moved a distance s = vt. The new wave front is a plane tangent to the wavelets and is where we would expect the wave to be a time t later. Huygens’s principle works for all types of waves, including water waves, sound waves, and light waves. It is useful not only in describing how light waves propagate but also in explaining the laws of reflection and refraction. In addition, we will see that Huygens’s principle tells us how and where light rays interfere.






Figure 1.26 Huygens’s principle applied to a straight wave front. Each point on the wave front emits a semicircular wavelet that moves a distance s = vt. The new wave front is a line tangent to the wavelets.


Reflection
Figure 1.27 shows how a mirror reflects an incoming wave at an angle equal to the incident angle, verifying the law of reflection. As the wave front strikes the mirror, wavelets are first emitted from the left part of the mirror and then from the right. The wavelets closer to the left have had time to travel farther, producing a wave front traveling in the direction shown.


Figure 1.27 Huygens’s principle applied to a plane wave front striking a mirror. The wavelets shown were emitted as each point on the wave front struck the mirror. The tangent to these wavelets shows that the new wave front has been reflected at an angle equal to the incident angle. The direction of propagation is perpendicular to the wave front, as shown by the downward-pointing arrows.
Refraction
The law of refraction can be explained by applying Huygens’s principle to a wave front passing from one medium to another (Figure 1.28). Each wavelet in the figure was emitted when the wave front crossed the interface between the media. Since the speed of light is smaller in the second medium, the waves do not travel as far in a given time, and the new wave front changes direction as shown. This explains why a ray changes direction to become closer to the perpendicular when light slows down. Snell’s law can be derived from the geometry in Figure 1.28 (Example 1.6).






Figure 1.28 Huygens’s principle applied to a plane wave front traveling from one medium to another, where its speed is less. The ray bends toward the perpendicular, since the wavelets have a lower speed in the second medium.


Example 1.6
Deriving the Law of Refraction
By examining the geometry of the wave fronts, derive the law of refraction.
Strategy
Consider Figure 1.29, which expands upon Figure 1.28. It shows the incident wave front just reaching the surface at point A, while point B is still well within medium 1. In the time Δt it takes for a wavelet from B to reach B′ on the surface at speed v1 = c/n1, a wavelet from A travels into medium 2 a distance of AA′ = v2Δt, where v2 = c/n2. Note that in this example, v2 is slower than v1 because n1 < n2.


Figure 1.29 Geometry of the law of refraction from medium 1 to medium 2.




Solution
The segment on the surface AB′ is shared by both the triangle ABB′ inside medium 1 and the triangle AA′B′
inside medium 2. Note that from the geometry, the angle ∠BAB′ is equal to the angle of incidence, θ1 .
Similarly, ∠AB′A′ is θ2 .
The length of AB′ is given in two ways as


$$ AB' = \frac{BB'}{\sin\theta_1} = \frac{AA'}{\sin\theta_2}. $$


Inverting the equation and substituting AA′ = cΔt/n2 from above and similarly BB′ = cΔt/n1 , we obtain
$$ \frac{\sin\theta_1}{c\Delta t/n_1} = \frac{\sin\theta_2}{c\Delta t/n_2}. $$


Cancellation of cΔt allows us to simplify this equation into the familiar form
n1 sin θ1 = n2 sin θ2.


Significance
Although the law of refraction was established experimentally by Snell and stated in Refraction, its derivation here requires Huygens’s principle and the understanding that the speed of light is different in different media.
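The geometric argument in this example can also be traced numerically. The sketch below follows the same steps (BB′ = cΔt/n1, AB′ = BB′/sin θ1, AA′ = cΔt/n2, sin θ2 = AA′/AB′) and compares the result with a direct application of n1 sin θ1 = n2 sin θ2. The function names, the time step, and the sample indices (air into water) are assumptions chosen only for illustration.

```python
import math

C = 2.998e8  # speed of light in vacuum, m/s


def refracted_angle_from_wavefronts(n1: float, n2: float, theta1_deg: float,
                                    dt: float = 1.0e-9) -> float:
    """Refraction angle (degrees) built from the wave-front triangles ABB' and AA'B'."""
    bb_prime = C * dt / n1                                  # BB' = v1 * dt
    ab_prime = bb_prime / math.sin(math.radians(theta1_deg))  # shared segment AB'
    aa_prime = C * dt / n2                                  # AA' = v2 * dt
    return math.degrees(math.asin(aa_prime / ab_prime))     # sin(theta2) = AA'/AB'


def refracted_angle_from_snell(n1: float, n2: float, theta1_deg: float) -> float:
    """Refraction angle (degrees) directly from n1 sin(theta1) = n2 sin(theta2)."""
    return math.degrees(math.asin(n1 * math.sin(math.radians(theta1_deg)) / n2))


# Sample case (assumed values): air (1.000) into water (1.333) at 30 degrees.
geometric = refracted_angle_from_wavefronts(1.000, 1.333, 30.0)
snell = refracted_angle_from_snell(1.000, 1.333, 30.0)
print(f"from wave fronts: {geometric:.4f} degrees")
print(f"from Snell's law: {snell:.4f} degrees")
```

The time step Δt cancels in the ratio AA′/AB′, which mirrors the cancellation of cΔt in the derivation.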


Check Your Understanding 1.5 In Example 1.6, we had n1 < n2. If n2 were decreased such that n1 > n2 and the speed of light in medium 2 is faster than in medium 1, what would happen to the length of AA′? What would happen to the wave front A′B′ and the direction of the refracted ray?


This applet (https://openstaxcollege.org/l/21walfedaniref) by Walter Fendt shows an animation of reflection and refraction using Huygens’s wavelets while you control the parameters. Be sure to click on “Next step” to display the wavelets. You can see the reflected and refracted wave fronts forming.


Diffraction
What happens when a wave passes through an opening, such as light shining through an open door into a dark room? For light, we observe a sharp shadow of the doorway on the floor of the room, and no visible light bends around corners into other parts of the room. When sound passes through a door, we hear it everywhere in the room and thus observe that sound spreads out when passing through such an opening (Figure 1.30). What is the difference between the behavior of sound waves and light waves in this case? The answer is that light has very short wavelengths and acts like a ray. Sound has wavelengths on the order of the size of the door and bends around corners (for a frequency of 1000 Hz,

$$ \lambda = \frac{c}{f} = \frac{330\ \text{m/s}}{1000\ \text{s}^{-1}} = 0.33\ \text{m}, $$

about three times smaller than the width of the doorway).
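To make the comparison of scales concrete, the short sketch below computes the sound wavelength from the values in the text and compares both sound and light with a doorway. The doorway width of 1.0 m and the choice of 550 nm (green light, as listed in Table 1.2) are assumptions for illustration, as is the function name `wavelength_m`.

```python
def wavelength_m(speed_m_per_s: float, frequency_hz: float) -> float:
    """Wavelength in meters from wavelength = speed / frequency."""
    return speed_m_per_s / frequency_hz


DOOR_WIDTH_M = 1.0          # assumed doorway width
SOUND_SPEED_M_PER_S = 330.0  # value used in the text
LIGHT_WAVELENGTH_M = 550e-9  # green light, 550 nm (assumed representative value)

sound_wavelength = wavelength_m(SOUND_SPEED_M_PER_S, 1000.0)

print(f"sound wavelength at 1000 Hz: {sound_wavelength:.2f} m")
print(f"doorway / sound wavelength:  {DOOR_WIDTH_M / sound_wavelength:.1f}")
print(f"doorway / light wavelength:  {DOOR_WIDTH_M / LIGHT_WAVELENGTH_M:.2e}")
```

The doorway is only a few sound wavelengths wide but on the order of a million light wavelengths wide, which is why sound spreads into the room while light casts a sharp shadow.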






Figure 1.30 (a) Light passing through a doorway makes a sharp outline on the floor. Since light’s wavelength is very small compared with the size of the door, it acts like a ray. (b) Sound waves bend into all parts of the room, a wave effect, because their wavelength is similar to the size of the door.


If we pass light through smaller openings such as slits, we can use Huygens’s principle to see that light bends as sound does (Figure 1.31). The bending of a wave around the edges of an opening or an obstacle is called diffraction. Diffraction is a wave characteristic and occurs for all types of waves. If diffraction is observed for some phenomenon, it is evidence that the phenomenon is a wave. Thus, the horizontal diffraction of the laser beam after it passes through the slits in Figure 1.31 is evidence that light is a wave. You will learn about diffraction in much more detail in the chapter on Diffraction.


Figure 1.31 Huygens’s principle applied to a plane wave front striking an opening. The edges of the wave front bend after passing through the opening, a process called diffraction. The amount of bending is more extreme for a small opening, consistent with the fact that wave characteristics are most noticeable for interactions with objects about the same size as the wavelength.






1.7 | Polarization
Learning Objectives


By the end of this section, you will be able to:
• Explain the change in intensity as polarized light passes through a polarizing filter
• Calculate the effect of polarization by reflection and Brewster’s angle
• Describe the effect of polarization by scattering
• Explain the use of polarizing materials in devices such as LCDs


Polarizing sunglasses are familiar to most of us. They have a special ability to cut the glare of light reflected from water or glass (Figure 1.32). They have this ability because of a wave characteristic of light called polarization. What is polarization? How is it produced? What are some of its uses? The answers to these questions are related to the wave character of light.


Figure 1.32 These two photographs of a river show the effect of a polarizing filter in reducing glare in light reflected from the surface of water. Part (b) of this figure was taken with a polarizing filter and part (a) was not. As a result, the reflection of clouds and sky observed in part (a) is not observed in part (b). Polarizing sunglasses are particularly useful on snow and water. (credit a and credit b: modifications of work by “Amithshs”/Wikimedia Commons)
Malus’s Law
Light is one type of electromagnetic (EM) wave. As noted in the previous chapter on Electromagnetic Waves (http://cnx.org/content/m58495/latest/), EM waves are transverse waves consisting of varying electric and magnetic fields that oscillate perpendicular to the direction of propagation (Figure 1.33). However, in general, there are no specific directions for the oscillations of the electric and magnetic fields; they vibrate in any randomly oriented plane perpendicular to the direction of propagation. Polarization is the attribute that a wave’s oscillations do have a definite direction relative to the direction of propagation of the wave. (This is not the same type of polarization as that discussed for the separation of charges.) Waves having such a direction are said to be polarized. For an EM wave, we define the direction of polarization to be the direction parallel to the electric field. Thus, we can think of the electric field arrows as showing the direction of polarization, as in Figure 1.33.






Figure 1.33 An EM wave, such as light, is a transverse wave. The electric ($\vec{E}$) and magnetic ($\vec{B}$) fields are perpendicular to the direction of propagation. The direction of polarization of the wave is the direction of the electric field.


To examine this further, consider the transverse waves in the ropes shown in Figure 1.34. The oscillations in one rope are in a vertical plane and are said to be vertically polarized. Those in the other rope are in a horizontal plane and are horizontally polarized. If a vertical slit is placed on the first rope, the waves pass through. However, a vertical slit blocks the horizontally polarized waves. For EM waves, the direction of the electric field is analogous to the disturbances on the ropes.


Figure 1.34 The transverse oscillations in one rope (a) are in a vertical plane, and those in the other rope (b) are in a horizontal plane. The first is said to be vertically polarized, and the other is said to be horizontally polarized. Vertical slits pass vertically polarized waves and block horizontally polarized waves.


The Sun and many other light sources produce waves that have the electric fields in random directions (Figure 1.35(a)). Such light is said to be unpolarized, because it is composed of many waves with all possible directions of polarization. Polaroid materials, which were invented by the founder of the Polaroid Corporation, Edwin Land, act as a polarizing slit for light, allowing only polarization in one direction to pass through. Polarizing filters are composed of long molecules aligned in one direction. If we think of the molecules as many slits, analogous to those for the oscillating ropes, we can understand why only light with a specific polarization can get through. The axis of a polarizing filter is the direction along which the filter passes the electric field of an EM wave.






Figure 1.35 The slender arrow represents a ray of unpolarized light. The bold arrows represent the direction of polarization of the individual waves composing the ray. (a) If the light is unpolarized, the arrows point in all directions. (b) A polarizing filter has a polarization axis that acts as a slit passing through electric fields parallel to its direction. The direction of polarization of an EM wave is defined to be the direction of its electric field.


Figure 1.36 shows the effect of two polarizing filters on originally unpolarized light. The first filter polarizes the light along its axis. When the axes of the first and second filters are aligned (parallel), then all of the polarized light passed by the first filter is also passed by the second filter. If the second polarizing filter is rotated, only the component of the light parallel to the second filter’s axis is passed. When the axes are perpendicular, no light is passed by the second filter.


Figure 1.36 The effect of rotating two polarizing filters, where the first polarizes the light. (a) All of the polarized light is passed by the second polarizing filter, because its axis is parallel to the first. (b) As the second filter is rotated, only part of the light is passed. (c) When the second filter is perpendicular to the first, no light is passed. (d) In this photograph, a polarizing filter is placed above two others. Its axis is perpendicular to the filter on the right (dark area) and parallel to the filter on the left (lighter area). (credit d: modification of work by P.P. Urone)






Only the component of the EM wave parallel to the axis of a filter is passed. Let us call the angle between the direction of polarization and the axis of a filter θ. If the electric field has an amplitude E, then the transmitted part of the wave has an amplitude E cos θ (Figure 1.37). Since the intensity of a wave is proportional to its amplitude squared, the intensity I of the transmitted wave is related to the incident wave by

$$ I = I_0 \cos^2\theta \qquad (1.6) $$


where I0 is the intensity of the polarized wave before passing through the filter. This equation is known as Malus’s law.
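Malus’s law is simple to evaluate numerically. The sketch below, with the illustrative function name `transmitted_intensity`, computes I = I0 cos² θ for a few angles, taking I0 = 1 in arbitrary units as an assumption; the parallel and perpendicular cases correspond to the two limiting situations described for Figure 1.36.

```python
import math


def transmitted_intensity(i0: float, theta_deg: float) -> float:
    """Intensity after a polarizing filter, from Malus's law I = I0 * cos^2(theta).

    i0 is the intensity of the already polarized incident light, and theta_deg is
    the angle between its polarization direction and the filter axis.
    """
    return i0 * math.cos(math.radians(theta_deg)) ** 2


# I0 = 1.0 (arbitrary units) is assumed for illustration.
for angle in (0.0, 30.0, 45.0, 60.0, 90.0):
    print(f"theta = {angle:4.1f} degrees -> I/I0 = {transmitted_intensity(1.0, angle):.3f}")
```

Note that the law applies to light that is already polarized; I0 here is the intensity leaving the first filter, not the intensity of the original unpolarized beam.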


Figure 1.37 A polarizing filter transmits only the component of the wave parallel to its axis, reducing the intensity of any light not polarized parallel to its axis.
This Open Source Physics animation (https://openstaxcollege.org/l/21phyanielefie) helps you visualize the electric field vectors as light encounters a polarizing filter. You can rotate the filter; note that the angle displayed is in radians. You can also rotate the animation for 3D visualization.


Example 1.7
Calculating Intensity Reduction by a Polarizing Filter
What angle is needed between the direction of polarized light and the axis of a polarizing filter to reduce its intensity by 90.0%?
Strategy
When the intensity is reduced by 90.0%, it is 10.0% or 0.100 times its original value. That is, I = 0.100 I0. Using this information, the equation I = I0 cos² θ can be used to solve for the needed angle.
Solution
Solving the equation I = I0 cos² θ for cos θ and substituting with the relationship between I and I0 gives

$$ \cos\theta = \sqrt{\frac{I}{I_0}} = \sqrt{\frac{0.100\,I_0}{I_0}} = 0.3162. $$

Solving for θ yields

$$ \theta = \cos^{-1} 0.3162 = 71.6^\circ. $$


Significance
A fairly large angle between the direction of polarization and the filter axis is needed to reduce the intensity to 10.0% of its original value. This seems reasonable based on experimenting with polarizing films. It is interesting that at an angle of 45°, the intensity is reduced to 50% of its original value. Note that 71.6° is 18.4° from reducing the intensity to zero, and that at an angle of 18.4°, the intensity is reduced to 90.0% of its original value, giving evidence of symmetry.
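The angles mentioned in this example can be recovered by inverting Malus’s law. The sketch below solves I/I0 = cos² θ for θ; the function name `angle_for_fraction_deg` is a hypothetical helper introduced for this illustration, and the angle is restricted to the range 0° to 90°.

```python
import math


def angle_for_fraction_deg(fraction: float) -> float:
    """Angle (degrees, between 0 and 90) at which I/I0 equals `fraction`.

    Inverts Malus's law: fraction = cos^2(theta), so theta = acos(sqrt(fraction)).
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return math.degrees(math.acos(math.sqrt(fraction)))


for fraction in (0.100, 0.500, 0.900):
    print(f"I/I0 = {fraction:.3f} -> theta = {angle_for_fraction_deg(fraction):.1f} degrees")
```

The angles for transmitted fractions of 0.100 and 0.900 add up to 90°, which is the symmetry noted in the Significance paragraph.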


Check Your Understanding 1.6 Although we did not specify the direction in Example 1.7, let’s say the polarizing filter was rotated clockwise by 71.6° to reduce the light intensity by 90.0%. What would be the intensity reduction if the polarizing filter were rotated counterclockwise by 71.6°?


Polarization by Reflection
By now, you can probably guess that polarizing sunglasses cut the glare in reflected light, because that light is polarized. You can check this for yourself by holding polarizing sunglasses in front of you and rotating them while looking at light reflected from water or glass. As you rotate the sunglasses, you will notice the light gets bright and dim, but not completely black. This implies the reflected light is partially polarized and cannot be completely blocked by a polarizing filter.
Figure 1.38 illustrates what happens when unpolarized light is reflected from a surface. Vertically polarized light is preferentially refracted at the surface, so the reflected light is left more horizontally polarized. The reasons for this phenomenon are beyond the scope of this text, but a convenient mnemonic for remembering this is to imagine the polarization direction to be like an arrow. Vertical polarization is like an arrow perpendicular to the surface and is more likely to stick and not be reflected. Horizontal polarization is like an arrow bouncing on its side and is more likely to be reflected. Sunglasses with vertical axes thus block more reflected light than unpolarized light from other sources.


Figure 1.38 Polarization by reflection. Unpolarized light has equal amounts of vertical and horizontal polarization. After interaction with a surface, the vertical components are preferentially absorbed or refracted, leaving the reflected light more horizontally polarized. This is akin to arrows striking on their sides and bouncing off, whereas arrows striking on their tips go into the surface.


Since the part of the light that is not reflected is refracted, the amount of polarization depends on the indices of refraction of the media involved. It can be shown that reflected light is completely polarized at an angle of reflection θb given by

$$\tan\theta_b = \frac{n_2}{n_1} \tag{1.7}$$






1.7


where n1 is the index of refraction of the medium in which the incident and reflected light travel and n2 is the index of refraction of the medium
that forms the interface that reflects the light. This equation is known as Brewster’s law and θb is known as Brewster’s
angle, named after the nineteenth-century Scottish physicist who discovered them.
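Brewster’s law can be sketched from the law of refraction. The sketch rests on one fact that is not derived in this section and is taken here as an assumption: at Brewster’s angle, the reflected and refracted rays are perpendicular to each other. The symbol θ2, introduced here for illustration, denotes the angle of refraction.

```latex
\begin{align*}
n_1 \sin\theta_b &= n_2 \sin\theta_2
  && \text{(law of refraction)} \\
\theta_b + \theta_2 &= 90^\circ
  && \text{(assumed: reflected ray $\perp$ refracted ray)} \\
n_1 \sin\theta_b &= n_2 \sin\left(90^\circ - \theta_b\right) = n_2 \cos\theta_b \\
\tan\theta_b &= \frac{n_2}{n_1}
\end{align*}
```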


This Open Source Physics animation (https://openstaxcollege.org/l/21phyaniincref) shows incident, reflected, and refracted light as rays and EM waves. Try rotating the animation for 3D visualization and also change the angle of incidence. Near Brewster’s angle, the reflected light becomes highly polarized.


Example 1.8
Calculating Polarization by Reflection
(a) At what angle will light traveling in air be completely polarized horizontally when reflected from water? (b) From glass?
Strategy
All we need to solve these problems are the indices of refraction. Air has n1 = 1.00, water has n2 = 1.333,
and crown glass has n2′ = 1.520. The equation $\tan\theta_b = n_2/n_1$ can be directly applied to find θb in each case.
Solution
a. Putting the known quantities into the equation

$$\tan\theta_b = \frac{n_2}{n_1}$$

gives

$$\tan\theta_b = \frac{n_2}{n_1} = \frac{1.333}{1.00} = 1.333.$$

Solving for the angle θb yields

$$\theta_b = \tan^{-1} 1.333 = 53.1^\circ.$$

b. Similarly, for crown glass and air,

$$\tan\theta_b' = \frac{n_2'}{n_1} = \frac{1.520}{1.00} = 1.52.$$

Thus,

$$\theta_b' = \tan^{-1} 1.52 = 56.7^\circ.$$


Significance
Light reflected at these angles could be completely blocked by a good polarizing filter held with its axis vertical. Brewster’s angle for water and air is similar to that for glass and air, so sunglasses are equally effective for light reflected from either water or glass under similar circumstances. Light that is not reflected is refracted into these media. Therefore, at an incident angle equal to Brewster’s angle, the refracted light is slightly polarized vertically. It is not completely polarized vertically, because only a small fraction of the incident light is reflected, so a significant amount of horizontally polarized light is refracted.
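The two calculations in this example follow the same pattern, so they can be reproduced with a short script. This is a minimal sketch using the indices of refraction given in the Strategy; the function name `brewster_angle_deg` and the constant names are hypothetical and chosen here only for illustration.

```python
import math


def brewster_angle_deg(n1: float, n2: float) -> float:
    """Return Brewster's angle in degrees from tan(theta_b) = n2 / n1.

    n1: index of refraction of the medium in which the incident and
        reflected light travel.
    n2: index of refraction of the medium that forms the reflecting interface.
    """
    # atan2(n2, n1) equals atan(n2 / n1) for positive indices.
    return math.degrees(math.atan2(n2, n1))


# Indices of refraction quoted in the Strategy of Example 1.8.
N_AIR = 1.00
N_WATER = 1.333
N_CROWN_GLASS = 1.520

print(f"air to water:       {brewster_angle_deg(N_AIR, N_WATER):.1f} deg")
print(f"air to crown glass: {brewster_angle_deg(N_AIR, N_CROWN_GLASS):.1f} deg")
```

The printed angles round to the 53.1° and 56.7° found in the Solution.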


Check Your Understanding What happens at Brewster’s angle if the original incident light is already
100% vertically polarized?


Atomic Explanation of Polarizing Filters
Polarizing filters have a polarization axis that acts as a slit. This slit passes EM waves (often visible light) that have an electric field parallel to the axis. This is accomplished with long molecules aligned perpendicular to the axis, as shown in Figure 1.39.


Figure 1.39 Long molecules are aligned perpendicular to the axis of a polarizing filter. In an EM wave, the component of the electric field perpendicular to these molecules passes through the filter, whereas the component parallel to the molecules is absorbed.


Figure 1.40 illustrates how the component of the electric field parallel to the long molecules is absorbed. An EM wave is composed of oscillating electric and magnetic fields. The electric field is strong compared with the magnetic field and is more effective in exerting force on charges in the molecules. The most affected charged particles are the electrons, since electron masses are small. If an electron is forced to oscillate, it can absorb energy from the EM wave. This reduces the field in the wave and, hence, reduces its intensity. In long molecules, electrons can more easily oscillate parallel to the molecule than in the perpendicular direction. The electrons are bound to the molecule and are more restricted in their movement perpendicular to the molecule. Thus, the electrons can absorb EM waves that have a component of their electric field parallel to the molecule. The electrons are much less responsive to electric fields perpendicular to the molecule and allow these fields to pass. Thus, the axis of the polarizing filter is perpendicular to the length of the molecule.


Figure 1.40 Diagram of an electron in a long molecule oscillating parallel to the molecule. The oscillation of the electron absorbs energy and reduces the intensity of the component of the EM wave that is parallel to the molecule.
Polarization by Scattering
If you hold your polarizing sunglasses in front of you and rotate them while looking at blue sky, you will see the sky get bright and dim. This is a clear indication that light scattered by air is partially polarized. Figure 1.41 helps illustrate how this happens. Since light is a transverse EM wave, it vibrates the electrons of air molecules perpendicular to the direction that it is traveling. The electrons then radiate like small antennae. Since they are oscillating perpendicular to the direction of the light ray, they produce EM radiation that is polarized perpendicular to the direction of the ray. When viewing the light along a line perpendicular to the original ray, as in the figure, there can be no polarization in the scattered light parallel to the original ray, because that would require the original ray to be a longitudinal wave. Along other directions, a component of the other polarization can be projected along the line of sight, and the scattered light is only partially polarized. Furthermore, multiple scattering can bring light to your eyes from other directions and can contain different polarizations.


Figure 1.41 Polarization by scattering. Unpolarized light scattering from air molecules shakes their electrons perpendicular to the direction of the original ray. The scattered light therefore has a polarization perpendicular to the original direction and none parallel to the original direction.


Photographs of the sky can be darkened by polarizing filters, a trick used by many photographers to make clouds brighter by contrast. Scattering from other particles, such as smoke or dust, can also polarize light. Detecting polarization in scattered EM waves can be a useful analytical tool in determining the scattering source.
A range of optical effects are used in sunglasses. Besides being polarizing, sunglasses may have colored pigments embedded in them, whereas others use either a nonreflective or reflective coating. A recent development is photochromic lenses, which darken in the sunlight and become clear indoors. Photochromic lenses are embedded with organic microcrystalline molecules that change their properties when exposed to UV in sunlight, but become clear in artificial lighting with no UV.
Liquid Crystals and Other Polarization Effects in Materials
Although you are undoubtedly aware of liquid crystal displays (LCDs) found in watches, calculators, computer screens, cellphones, flat screen televisions, and many other places, you may not be aware that they are based on polarization. Liquid crystals are so named because their molecules can be aligned even though they are in a liquid. Liquid crystals have the property that they can rotate the polarization of light passing through them by 90°. Furthermore, this property can be turned off by the application of a voltage, as illustrated in Figure 1.42. It is possible to manipulate this characteristic quickly and in small, well-defined regions to create the contrast patterns we see in so many LCD devices.
In flat screen LCD televisions, a large light is generated at the back of the TV. The light travels to the front screen through millions of tiny units called pixels (picture elements). One of these is shown in Figure 1.42(a) and (b). Each unit has three cells, with red, blue, or green filters, each controlled independently. When the voltage across a liquid crystal is switched off, the liquid crystal passes the light through the particular filter. We can vary the picture contrast by varying the strength of the voltage applied to the liquid crystal.






Figure 1.42 (a) Polarized light is rotated 90° by a liquid crystal and then passed by a polarizing filter that has its axis perpendicular to the direction of the original polarization. (b) When a voltage is applied to the liquid crystal, the polarized light is not rotated and is blocked by the filter, making the region dark in comparison with its surroundings. (c) LCDs can be made color specific, small, and fast enough to use in laptop computers and TVs.


Many crystals and solutions rotate the plane of polarization of light passing through them. Such substances are said to be optically active. Examples include sugar water, insulin, and collagen (Figure 1.43). In addition to depending on the type of substance, the amount and direction of rotation depend on several other factors. Among these are the concentration of the substance, the distance the light travels through it, and the wavelength of light. Optical activity is due to the asymmetrical shape of molecules in the substance, such as being helical. Measurements of the rotation of polarized light passing through substances can thus be used to measure concentrations, a standard technique for sugars. It can also give information on the shapes of molecules, such as proteins, and factors that affect their shapes, such as temperature and pH.


Figure 1.43 Optical activity is the ability of some substances to rotate the plane of polarization of light passing through them. The rotation is detected with a polarizing filter or analyzer.


Glass and plastic become optically active when stressed: the greater the stress, the greater the effect. Optical stress analysis on complicated shapes can be performed by making plastic models of them and observing them through crossed filters, as seen in Figure 1.44. It is apparent that the effect depends on wavelength as well as stress. The wavelength dependence is sometimes also used for artistic purposes.


Figure 1.44 Optical stress analysis of a plastic lens placed between crossed polarizers. (credit: “Infopro”/Wikimedia Commons)


Another interesting phenomenon associated with polarized light is the ability of some crystals to split an unpolarized beam of light into two polarized beams. This occurs because the crystal has one value for the index of refraction of light polarized in one direction but a different value for the index of refraction of light polarized in the perpendicular direction, so that each component has its own angle of refraction. Such crystals are said to be birefringent, and, when aligned properly, two perpendicularly polarized beams will emerge from the crystal (Figure 1.45). Birefringent crystals can be used to produce polarized beams from unpolarized light. Some birefringent materials preferentially absorb one of the polarizations. These materials are called dichroic and can produce polarization by this preferential absorption. This is fundamentally how polarizing filters and other polarizers work.


Figure 1.45 Birefringent materials, such as the common mineral calcite, split unpolarized beams of light into two with two different values of index of refraction.






CHAPTER 1 REVIEW
KEY TERMS

birefringent: refers to crystals that split an unpolarized beam of light into two beams
Brewster’s angle: angle of incidence at which the reflected light is completely polarized
Brewster’s law: $\tan\theta_b = n_2/n_1$, where n1 is the index of refraction of the medium in which the incident and reflected light travel and n2 is the index of refraction of the medium that forms the interface that reflects the light
corner reflector: object consisting of two (or three) mutually perpendicular reflecting surfaces, so that the light that enters is reflected back exactly parallel to the direction from which it came
critical angle: incident angle that produces an angle of refraction of 90°
direction of polarization: direction parallel to the electric field for EM waves
dispersion: spreading of light into its spectrum of wavelengths
fiber optics: field of study of the transmission of light down fibers of plastic or glass, applying the principle of total internal reflection
geometric optics: part of optics dealing with the ray aspect of light
horizontally polarized: oscillations are in a horizontal plane
Huygens’s principle: every point on a wave front is a source of wavelets that spread out in the forward direction at the same speed as the wave itself; the new wave front is a plane tangent to all of the wavelets
index of refraction: for a material, the ratio of the speed of light in a vacuum to that in a material
law of reflection: angle of reflection equals the angle of incidence
law of refraction: when a light ray crosses from one medium to another, it changes direction by an amount that depends on the index of refraction of each medium and the sines of the angle of incidence and angle of refraction
Malus’s law: $I = I_0 \cos^2\theta$, where I0 is the intensity of the polarized wave before passing through the filter
optically active: substances that rotate the plane of polarization of light passing through them
polarization: attribute that wave oscillations have a definite direction relative to the direction of propagation of the wave
polarized: refers to waves having the electric and magnetic field oscillations in a definite direction
ray: straight line that originates at some point
refraction: changing of a light ray’s direction when it passes through variations in matter
total internal reflection: phenomenon at the boundary between two media such that all the light is reflected and no refraction occurs
unpolarized: refers to waves that are randomly polarized
vertically polarized: oscillations are in a vertical plane
wave optics: part of optics dealing with the wave aspect of light


KEY EQUATIONS
Speed of light: $c = 2.99792458 \times 10^{8}\ \text{m/s} \approx 3.00 \times 10^{8}\ \text{m/s}$
Index of refraction: $n = \frac{c}{v}$
Law of reflection: $\theta_r = \theta_i$
Law of refraction (Snell’s law): $n_1 \sin\theta_1 = n_2 \sin\theta_2$
Critical angle: $\theta_c = \sin^{-1}\left(\frac{n_2}{n_1}\right)$ for $n_1 > n_2$
Malus’s law: $I = I_0 \cos^2\theta$
Brewster’s law: $\tan\theta_b = \frac{n_2}{n_1}$
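Two of the key equations, the law of refraction and the critical angle, are closely linked: the critical angle is the incident angle at which Snell’s law gives a 90° angle of refraction. The following is a minimal sketch of both; the function names are hypothetical helpers chosen for illustration, and the indices used in the usage lines (1.333 for water, 1.00 for air) are the values quoted in this chapter.

```python
import math
from typing import Optional


def refraction_angle_deg(n1: float, n2: float, theta1_deg: float) -> Optional[float]:
    """Angle of refraction from n1 sin(theta1) = n2 sin(theta2), in degrees.

    Returns None when no refracted ray exists (total internal reflection).
    """
    sin_theta2 = n1 * math.sin(math.radians(theta1_deg)) / n2
    if sin_theta2 > 1.0:
        return None
    return math.degrees(math.asin(sin_theta2))


def critical_angle_deg(n1: float, n2: float) -> float:
    """Critical angle theta_c = asin(n2 / n1), defined only for n1 > n2."""
    if n1 <= n2:
        raise ValueError("a critical angle exists only when n1 > n2")
    return math.degrees(math.asin(n2 / n1))


# Light traveling in water (n1 = 1.333) toward air (n2 = 1.00).
print(f"critical angle, water to air: {critical_angle_deg(1.333, 1.00):.1f} deg")
print("refraction at 30 deg:", refraction_angle_deg(1.333, 1.00, 30.0))
print("refraction at 60 deg:", refraction_angle_deg(1.333, 1.00, 60.0))
```

An incident angle of 60° exceeds the critical angle for water and air, so the last call returns `None` rather than an angle.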


SUMMARY
1.1 The Propagation of Light


• The speed of light in a vacuum is $c = 2.99792458 \times 10^{8}\ \text{m/s} \approx 3.00 \times 10^{8}\ \text{m/s}$.
• The index of refraction of a material is n = c/v, where v is the speed of light in a material and c is the speed of light in a vacuum.
• The ray model of light describes the path of light as straight lines. The part of optics dealing with the ray aspect of light is called geometric optics.
• Light can travel in three ways from a source to another location: (1) directly from the source through empty space; (2) through various media; and (3) after being reflected from a mirror.


1.2 The Law of Reflection
• When a light ray strikes a smooth surface, the angle of reflection equals the angle of incidence.
• A mirror has a smooth surface and reflects light at specific angles.
• Light is diffused when it reflects from a rough surface.


1.3 Refraction
• The change of a light ray’s direction when it passes through variations in matter is called refraction.
• The law of refraction, also called Snell’s law, relates the indices of refraction for two media at an interface to the change in angle of a light ray passing through that interface.


1.4 Total Internal Reflection
• The incident angle that produces an angle of refraction of 90° is called the critical angle.
• Total internal reflection is a phenomenon that occurs at the boundary between two media, such that if the incident angle in the first medium is greater than the critical angle, then all the light is reflected back into that medium.
• Fiber optics involves the transmission of light down fibers of plastic or glass, applying the principle of total internal reflection.
• Cladding prevents light from being transmitted between fibers in a bundle.
• Diamonds sparkle due to total internal reflection coupled with a large index of refraction.


1.5 Dispersion
• The spreading of white light into its full spectrum of wavelengths is called dispersion.
• Rainbows are produced by a combination of refraction and reflection, and involve the dispersion of sunlight into a continuous distribution of colors.
• Dispersion produces beautiful rainbows but also causes problems in certain optical systems.






1.6 Huygens’s Principle
• According to Huygens’s principle, every point on a wave front is a source of wavelets that spread out in the forward direction at the same speed as the wave itself. The new wave front is tangent to all of the wavelets.
• A mirror reflects an incoming wave at an angle equal to the incident angle, verifying the law of reflection.
• The law of refraction can be explained by applying Huygens’s principle to a wave front passing from one medium to another.
• The bending of a wave around the edges of an opening or an obstacle is called diffraction.


1.7 Polarization
• Polarization is the attribute that wave oscillations have a definite direction relative to the direction of propagation of the wave. The direction of polarization is defined to be the direction parallel to the electric field of the EM wave.
• Unpolarized light is composed of many rays having random polarization directions.
• Unpolarized light can be polarized by passing it through a polarizing filter or other polarizing material. The process of polarizing light decreases its intensity by a factor of 2.
• The intensity, I, of polarized light after passing through a polarizing filter is $I = I_0 \cos^2\theta$, where I0 is the incident intensity and θ is the angle between the direction of polarization and the axis of the filter.


• Polarization is also produced by reflection.
• Brewster’s law states that reflected light is completely polarized at the angle of reflection θb, known as Brewster’s angle.


• Polarization can also be produced by scattering.
• Several types of optically active substances rotate the direction of polarization of light passing through them.


CONCEPTUAL QUESTIONS
1.1 The Propagation of Light
1. Under what conditions can light be modeled like a ray? Like a wave?
2. Why is the index of refraction always greater than or equal to 1?
3. Does the fact that the light flash from lightning reaches you before its sound prove that the speed of light is extremely large or simply that it is greater than the speed of sound? Discuss how you could use this effect to get an estimate of the speed of light.
4. Speculate as to what physical process might be responsible for light traveling more slowly in a medium than in a vacuum.


1.2 The Law of Reflection
5. Using the law of reflection, explain how powder takes the shine off of a person’s nose. What is the name of the optical effect?


1.3 Refraction
6. Diffusion by reflection from a rough surface is described in this chapter. Light can also be diffused by refraction. Describe how this occurs in a specific situation, such as light interacting with crushed ice.
7. Will light change direction toward or away from the perpendicular when it goes from air to water? Water to glass? Glass to air?
8. Explain why an object in water always appears to be at a depth shallower than it actually is.
9. Explain why a person’s legs appear very short when wading in a pool. Justify your explanation with a ray diagram showing the path of rays from the feet to the eye of an observer who is out of the water.
10. Explain why an oar that is partially submerged in water appears bent.


1.4 Total Internal Reflection
11. A ring with a colorless gemstone is dropped into water. The gemstone becomes invisible when submerged. Can it be a diamond? Explain.
12. The most common type of mirage is an illusion that light from faraway objects is reflected by a pool of water that is not really there. Mirages are generally observed in deserts, when there is a hot layer of air near the ground. Given that the refractive index of air is lower for air at higher temperatures, explain how mirages can be formed.
13. How can you use total internal reflection to estimate the index of refraction of a medium?


1.5 Dispersion
14. Is it possible that total internal reflection plays a role in rainbows? Explain in terms of indices of refraction and angles, perhaps referring to that shown below. Some of us have seen the formation of a double rainbow; is it physically possible to observe a triple rainbow?


15. A high-quality diamond may be quite clear and colorless, transmitting all visible wavelengths with little absorption. Explain how it can sparkle with flashes of brilliant color when illuminated by white light.


1.6 Huygens’s Principle
16. How do wave effects depend on the size of the object with which the wave interacts? For example, why does sound bend around the corner of a building while light does not?


17. Does Huygens’s principle apply to all types of waves?
18. If diffraction is observed for some phenomenon, it is evidence that the phenomenon is a wave. Does the reverse hold true? That is, if diffraction is not observed, does that mean the phenomenon is not a wave?


1.7 Polarization
19. Can a sound wave in air be polarized? Explain.
20. No light passes through two perfect polarizing filters with perpendicular axes. However, if a third polarizing filter is placed between the original two, some light can pass. Why is this? Under what circumstances does most of the light pass?
21. Explain what happens to the energy carried by light that is dimmed by passing it through two crossed polarizing filters.
22. When particles scattering light are much smaller than its wavelength, the amount of scattering is proportional to $1/\lambda$. Does this mean there is more scattering for small λ than large λ? How does this relate to the fact that the sky is blue?
23. Using the information given in the preceding question, explain why sunsets are red.
24. When light is reflected at Brewster’s angle from a smooth surface, it is 100% polarized parallel to the surface. Part of the light will be refracted into the surface. Describe how you would do an experiment to determine the polarization of the refracted light. What direction would you expect the polarization to have and would you expect it to be 100%?
25. If you lie on a beach looking at the water with your head tipped slightly sideways, your polarized sunglasses do not work very well. Why not?


PROBLEMS
1.1 The Propagation of Light
26. What is the speed of light in water? In glycerine?
27. What is the speed of light in air? In crown glass?
28. Calculate the index of refraction for a medium in which the speed of light is $2.012 \times 10^{8}$ m/s, and identify the most likely substance based on Table 1.1.
29. In what substance in Table 1.1 is the speed of light $2.290 \times 10^{8}$ m/s?






30. There was a major collision of an asteroid with the Moon in medieval times. It was described by monks at Canterbury Cathedral in England as a red glow on and around the Moon. How long after the asteroid hit the Moon, which is $3.84 \times 10^{5}$ km away, would the light first arrive on Earth?
31. Components of some computers communicate with each other through optical fibers having an index of refraction n = 1.55. What time in nanoseconds is required for a signal to travel 0.200 m through such a fiber?
32. Compare the time it takes for light to travel 1000 m on the surface of Earth and in outer space.
33. How far does light travel underwater during a time interval of $1.50 \times 10^{-6}$ s?


1.2 The Law of Reflection
34. Suppose a man stands in front of a mirror as shown below. His eyes are 1.65 m above the floor and the top of his head is 0.13 m higher. Find the height above the floor of the top and bottom of the smallest mirror in which he can see both the top of his head and his feet. How is this distance related to the man’s height?


35. Show that when light reflects from two mirrors that meet each other at a right angle, the outgoing ray is parallel to the incoming ray, as illustrated below.


36. On the Moon’s surface, lunar astronauts placed a corner reflector, off which a laser beam is periodically reflected. The distance to the Moon is calculated from the round-trip time. What percent correction is needed to account for the delay in time due to the slowing of light in Earth’s atmosphere? Assume the distance to the Moon is precisely $3.84 \times 10^{8}$ m and Earth’s atmosphere (which varies in density with altitude) is equivalent to a layer 30.0 km thick with a constant index of refraction n = 1.000293.


37. A flat mirror is neither converging nor diverging. To prove this, consider two rays originating from the same point and diverging at an angle θ (see below). Show that after striking a plane mirror, the angle between their directions remains θ.


1.3 Refraction
Unless otherwise specified, for the problems in this section, the indices of refraction of glass and water should be taken to be 1.50 and 1.333, respectively.
38. A light beam in air has an angle of incidence of 35° at the surface of a glass plate. What are the angles of reflection and refraction?






39. A light beam in air is incident on the surface of a pond, making an angle of 20° with respect to the surface. What are the angles of reflection and refraction?
40. When a light ray crosses from water into glass, it emerges at an angle of 30° with respect to the normal of the interface. What is its angle of incidence?
41. A pencil flashlight submerged in water sends a light beam toward the surface at an angle of incidence of 30°. What is the angle of refraction in air?
42. Light rays from the Sun make a 30° angle to the vertical when seen from below the surface of a body of water. At what angle above the horizon is the Sun?
43. The path of a light beam in air goes from an angle of incidence of 35° to an angle of refraction of 22° when it enters a rectangular block of plastic. What is the index of refraction of the plastic?
44. A scuba diver training in a pool looks at his instructor as shown below. What angle does the ray from the instructor’s face make with the perpendicular to the water at the point where the ray enters? The angle between the ray in the water and the perpendicular to the water is 25.0°.


45. (a) Using information in the preceding problem, find the height of the instructor’s head above the water, noting that you will first have to calculate the angle of incidence. (b) Find the apparent depth of the diver’s head below water as seen by the instructor.


1.4 Total Internal Reflection
46. Verify that the critical angle for light going from water to air is 48.6°, as discussed at the end of Example 1.4, regarding the critical angle for light traveling in a polystyrene (a type of plastic) pipe surrounded by air.
47. (a) At the end of Example 1.4, it was stated that the critical angle for light going from diamond to air is 24.4°. Verify this. (b) What is the critical angle for light going from zircon to air?
48. An optical fiber uses flint glass clad with crown glass. What is the critical angle?
49. At what minimum angle will you get total internal reflection of light traveling in water and reflected from ice?
50. Suppose you are using total internal reflection to make an efficient corner reflector. If there is air outside and the incident angle is 45.0°, what must be the minimum index of refraction of the material from which the reflector is made?
51. You can determine the index of refraction of a substance by determining its critical angle. (a) What is the index of refraction of a substance that has a critical angle of 68.4° when submerged in water? What is the substance, based on Table 1.1? (b) What would the critical angle be for this substance in air?
52. A ray of light, emitted beneath the surface of an unknown liquid with air above it, undergoes total internal reflection as shown below. What is the index of refraction for the liquid and its likely identification?


53. Light rays fall normally on the vertical surface of the glass prism (n = 1.50) shown below. (a) What is the largest value for ϕ such that the ray is totally reflected at the slanted face? (b) Repeat the calculation of part (a) if the prism is immersed in water.






1.5 Dispersion
54. (a) What is the ratio of the speed of red light to violet light in diamond, based on Table 1.2? (b) What is this ratio in polystyrene? (c) Which is more dispersive?
55. A beam of white light goes from air into water at an incident angle of 75.0°. At what angles are the red (660 nm) and violet (410 nm) parts of the light refracted?
56. By how much do the critical angles for red (660 nm) and violet (410 nm) light differ in a diamond surrounded by air?
57. (a) A narrow beam of light containing yellow (580 nm) and green (550 nm) wavelengths goes from polystyrene to air, striking the surface at a 30.0° incident angle. What is the angle between the colors when they emerge? (b) How far would they have to travel to be separated by 1.00 mm?
58. A parallel beam of light containing orange (610 nm) and violet (410 nm) wavelengths goes from fused quartz to water, striking the surface between them at a 60.0° incident angle. What is the angle between the two colors in water?
59. A ray of 610-nm light goes from air into fused quartz at an incident angle of 55.0°. At what incident angle must 470 nm light enter flint glass to have the same angle of refraction?
60. A narrow beam of light containing red (660 nm) and blue (470 nm) wavelengths travels from air through a 1.00-cm-thick flat piece of crown glass and back to air again. The beam strikes at a 30.0° incident angle. (a) At what angles do the two colors emerge? (b) By what distance are the red and blue separated when they emerge?


61. A narrow beam of white light enters a prism made of crown glass at a 45.0° incident angle, as shown below. At what angles, θR and θV, do the red (660 nm) and violet (410 nm) components of the light emerge from the prism?


1.7 Polarization
62. What angle is needed between the direction of polarized light and the axis of a polarizing filter to cut its intensity in half?
63. The angle between the axes of two polarizing filters is 45.0°. By how much does the second filter reduce the intensity of the light coming through the first?
64. Two polarizing sheets P1 and P2 are placed together with their transmission axes oriented at an angle θ to each other. What is θ when only 25% of the maximum transmitted light intensity passes through them?
65. Suppose that in the preceding problem the light incident on P1 is unpolarized. At the determined value of θ, what fraction of the incident light passes through the combination?
66. If you have completely polarized light of intensity 150 W/m², what will its intensity be after passing through a polarizing filter with its axis at an 89.0° angle to the light’s polarization direction?
67. What angle would the axis of a polarizing filter need to make with the direction of polarized light of intensity 1.00 kW/m² to reduce the intensity to 10.0 W/m²?
68. At the end of Example 1.7, it was stated that the intensity of polarized light is reduced to 90.0% of its original value by passing through a polarizing filter with its axis at an angle of 18.4° to the direction of polarization. Verify this statement.
69. Show that if you have three polarizing filters, with the second at an angle of 45.0° to the first and the third at an angle of 90.0° to the first, the intensity of light passed by the first will be reduced to 25.0% of its value. (This is in contrast to having only the first and third, which reduces the intensity to zero, so that placing the second between them increases the intensity of the transmitted light.)
70. Three polarizing sheets are placed together such that the transmission axis of the second sheet is oriented at 25.0° to the axis of the first, whereas the transmission axis of the third sheet is oriented at 40.0° (in the same sense) to the axis of the first. What fraction of the intensity of an incident unpolarized beam is transmitted by the combination?
71. In order to rotate the polarization axis of a beam of linearly polarized light by 90.0°, a student places sheets P1 and P2 with their transmission axes at 45.0° and 90.0°, respectively, to the beam’s axis of polarization. (a) What fraction of the incident light passes through P1 and (b) through the combination? (c) Repeat your calculations for part (b) for transmission-axis angles of 30.0° and 90.0°, respectively.
72. It is found that when light traveling in water falls on a plastic block, Brewster’s angle is 50.0°. What is the refractive index of the plastic?
73. At what angle will light reflected from diamond be completely polarized?
74. What is Brewster’s angle for light traveling in water that is reflected from crown glass?
75. A scuba diver sees light reflected from the water’s surface. At what angle will this light be completely polarized?


ADDITIONAL PROBLEMS
76. From his measurements, Roemer estimated that it took 22 min for light to travel a distance equal to the diameter of Earth’s orbit around the Sun. (a) Use this estimate along with the known diameter of Earth’s orbit to obtain a rough value of the speed of light. (b) Light actually takes 16.5 min to travel this distance. Use this time to calculate the speed of light.
77. Cornu performed Fizeau’s measurement of the speed of light using a wheel of diameter 4.00 cm that contained 180 teeth. The distance from the wheel to the mirror was 22.9 km. Assuming he measured the speed of light accurately, what was the angular velocity of the wheel?
78. Suppose you have an unknown clear substance immersed in water, and you wish to identify it by finding its index of refraction. You arrange to have a beam of light enter it at an angle of 45.0°, and you observe the angle of refraction to be 40.3°. What is the index of refraction of the substance and its likely identity?
79. Shown below is a ray of light going from air through crown glass into water, such as going into a fish tank. Calculate the amount the ray is displaced by the glass (Δx), given that the incident angle is 40.0° and the glass is 1.00 cm thick.


80. Considering the previous problem, show that $\theta_3$ is the same as it would be if the second medium were not present.
81. At what angle is light inside crown glass completely polarized when reflected from water, as in a fish tank?
82. Light reflected at 55.6° from a window is completely polarized. What is the window’s index of refraction and the likely substance of which it is made?
83. (a) Light reflected at 62.5° from a gemstone in a ring is completely polarized. Can the gem be a diamond? (b) At what angle would the light be completely polarized if the gem was in water?
84. If $\theta_b$ is Brewster’s angle for light reflected from the top of an interface between two substances, and $\theta_b'$ is Brewster’s angle for light reflected from below, prove that $\theta_b + \theta_b' = 90.0°$.
85. Unreasonable results Suppose light travels from water to another substance, with an angle of incidence of 10.0° and an angle of refraction of 14.9°. (a) What is the index of refraction of the other substance? (b) What is unreasonable about this result? (c) Which assumptions are unreasonable or inconsistent?
86. Unreasonable results Light traveling from water to a gemstone strikes the surface at an angle of 80.0° and has an angle of refraction of 15.2°. (a) What is the speed of light in the gemstone? (b) What is unreasonable about this result? (c) Which assumptions are unreasonable or inconsistent?


87. If a polarizing filter reduces the intensity of polarized light to 50.0% of its original value, by how much are the electric and magnetic fields reduced?
88. Suppose you put on two pairs of polarizing sunglasses with their axes at an angle of 15.0°. How much longer will it take the light to deposit a given amount of energy in your eye compared with a single pair of sunglasses? Assume the lenses are clear except for their polarizing characteristics.
89. (a) On a day when the intensity of sunlight is 1.00 kW/m², a circular lens 0.200 m in diameter focuses light onto water in a black beaker. Two polarizing sheets of plastic are placed in front of the lens with their axes at an angle of 20.0°. Assuming the sunlight is unpolarized and the polarizers are 100% efficient, what is the initial rate of heating of the water in °C/s, assuming it is 80.0% absorbed? The aluminum beaker has a mass of 30.0 grams and contains 250 grams of water. (b) Do the polarizing filters get hot? Explain.


CHALLENGE PROBLEMS
90. Light shows staged with lasers use moving mirrors to swing beams and create colorful effects. Show that a light ray reflected from a mirror changes direction by 2θ when the mirror is rotated by an angle θ.
91. Consider sunlight entering Earth’s atmosphere at sunrise and sunset, that is, at a 90.0° incident angle. Taking the boundary between nearly empty space and the atmosphere to be sudden, calculate the angle of refraction for sunlight. This lengthens the time the Sun appears to be above the horizon, both at sunrise and sunset. Now construct a problem in which you determine the angle of refraction for different models of the atmosphere, such as various layers of varying density. Your instructor may wish to guide you on the level of complexity to consider and on how the index of refraction varies with air density.
92. A light ray entering an optical fiber surrounded by air is first refracted and then reflected as shown below. Show that if the fiber is made from crown glass, any incident ray will be totally internally reflected.


93. A light ray falls on the left face of a prism (see below) at the angle of incidence θ for which the emerging beam has an angle of refraction θ at the right face. Show that the index of refraction n of the glass prism is given by

$$n = \frac{\sin\frac{1}{2}(\alpha + \phi)}{\sin\frac{1}{2}\phi}$$

where ϕ is the vertex angle of the prism and α is the angle through which the beam has been deviated. If α = 37.0° and the base angles of the prism are each 50.0°, what is n?


94. If the apex angle ϕ in the previous problem is 20.0° and n = 1.50, what is the value of α?
95. The light incident on polarizing sheet P1 is linearly polarized at an angle of 30.0° with respect to the transmission axis of P1. Sheet P2 is placed so that its axis is parallel to the polarization axis of the incident light, that is, also at 30.0° with respect to P1. (a) What fraction of the incident light passes through P1? (b) What fraction of the incident light is passed by the combination? (c) By rotating P2, a maximum in transmitted intensity is obtained. What is the ratio of this maximum intensity to the intensity of transmitted light when P2 is at 30.0° with respect to P1?
96. Prove that if I is the intensity of light transmitted by two polarizing filters with axes at an angle θ and I′ is the intensity when the axes are at an angle 90.0° − θ, then $I + I' = I_0$, the original intensity. (Hint: Use the trigonometric identities cos(90.0° − θ) = sin θ and cos² θ + sin² θ = 1.)






2 | GEOMETRIC OPTICS AND IMAGE FORMATION


Figure 2.1 Cloud Gate is a public sculpture by Anish Kapoor located in Millennium Park in Chicago. Its stainless steel plates reflect and distort images around it, including the Chicago skyline. Dedicated in 2006, it has become a popular tourist attraction, illustrating how art can use the principles of physical optics to startle and entertain. (credit: modification of work by Dhilung Kirat)


Chapter Outline
2.1 Images Formed by Plane Mirrors
2.2 Spherical Mirrors
2.3 Images Formed by Refraction
2.4 Thin Lenses
2.5 The Eye
2.6 The Camera
2.7 The Simple Magnifier
2.8 Microscopes and Telescopes


Introduction
This chapter introduces the major ideas of geometric optics, which describe the formation of images due to reflection and refraction. It is called “geometric” optics because the images can be characterized using geometric constructions, such as ray diagrams. We have seen that visible light is an electromagnetic wave; however, its wave nature becomes evident only when light interacts with objects with dimensions comparable to the wavelength (about 500 nm for visible light). Therefore, the laws of geometric optics only apply to light interacting with objects much larger than the wavelength of the light.






2.1 | Images Formed by Plane Mirrors
Learning Objectives


By the end of this section, you will be able to:
• Describe how an image is formed by a plane mirror.
• Distinguish between real and virtual images.
• Find the location and characterize the orientation of an image created by a plane mirror.


You only have to look as far as the nearest bathroom to find an example of an image formed by a mirror. Images in a plane mirror are the same size as the object, are located behind the mirror, and are oriented in the same direction as the object (i.e., “upright”).
To understand how this happens, consider Figure 2.2. Two rays emerge from point P, strike the mirror, and reflect into the observer’s eye. Note that we use the law of reflection to construct the reflected rays. If the reflected rays are extended backward behind the mirror (see dashed lines in Figure 2.2), they seem to originate from point Q. This is where the image of point P is located. If we repeat this process for point P′, we obtain its image at point Q′. You should convince yourself by using basic geometry that the image height (the distance from Q to Q′) is the same as the object height (the distance from P to P′). By forming images of all points of the object, we obtain an upright image of the object behind the mirror.
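The construction described above can also be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the mirror lies in the plane x = 0 with the object at negative x, and the function names and the two chosen strike heights are purely illustrative. Each ray is reflected by flipping the component of its direction along the mirror normal, and the two reflected rays are extended backward until they meet.

```python
def reflect_from_plane_mirror(source, hit_y):
    """Reflect a ray from `source` off the plane mirror x = 0 at height hit_y.

    Returns the point where the ray strikes the mirror and the direction of
    the reflected ray. The law of reflection flips the component of the
    direction along the mirror normal (the x-axis in this sketch).
    """
    sx, sy = source
    dx, dy = 0.0 - sx, hit_y - sy
    return (0.0, hit_y), (-dx, dy)


def intersect(p1, d1, p2, d2):
    """Intersection of the lines p1 + t*d1 and p2 + s*d2 (must not be parallel)."""
    det = d2[0] * d1[1] - d1[0] * d2[1]
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    t = (d2[0] * ry - rx * d2[1]) / det
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])


def image_of(source, hit_heights=(0.5, 1.5)):
    """Locate the image of `source` by extending two reflected rays backward."""
    (a, da), (b, db) = (reflect_from_plane_mirror(source, y) for y in hit_heights)
    return intersect(a, da, b, db)


P, P_prime = (-3.0, 2.0), (-3.0, 0.0)   # top and base of a hypothetical object
Q, Q_prime = image_of(P), image_of(P_prime)
print(Q, Q_prime)                        # (3.0, 2.0) (3.0, 0.0)
print("image height:", Q[1] - Q_prime[1])
```

In this sketch the image points land as far behind the mirror as the object points are in front of it, and the image height equals the object height, which is the result the geometric argument asks you to verify.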


Figure 2.2 Two light rays originating from point P on an object are reflected by a flat mirror into the eye of an observer. The reflected rays are obtained by using the law of reflection. Extending these reflected rays backward, they seem to come from point Q behind the mirror, which is where the virtual image is located. Repeating this process for point P′ gives the image point Q′. The image height is thus the same as the object height, the image is upright, and the object distance $d_o$ is the same as the image distance $d_i$. (credit: modification of work by Kevin Dufendach)


Notice that the reflected rays appear to the observer to come directly from the image behind the mirror. In reality, these rays come from the points on the mirror where they are reflected. The image behind the mirror is called a virtual image because it cannot be projected onto a screen; the rays only appear to originate from a common point behind the mirror. If you walk behind the mirror, you cannot see the image, because the rays do not go there. However, in front of the mirror, the rays behave exactly as if they come from behind the mirror, so that is where the virtual image is located.
Later in this chapter, we discuss real images; a real image can be projected onto a screen because the rays physically go through the image. You can certainly see both real and virtual images. The difference is that a virtual image cannot be projected onto a screen, whereas a real image can.






Locating an Image in a Plane Mirror
The law of reflection tells us that the angle of incidence is the same as the angle of reflection. Applying this to triangles PAB and QAB in Figure 2.2 and using basic geometry shows that they are congruent triangles. This means that the distance PB from the object to the mirror is the same as the distance BQ from the mirror to the image. The object distance (denoted $d_o$) is the distance from the mirror to the object (or, more generally, from the center of the optical element that creates its image). Similarly, the image distance (denoted $d_i$) is the distance from the mirror to the image (or, more generally, from the center of the optical element that creates it). If we measure distances from the mirror, then the object and image are in opposite directions, so for a plane mirror, the object and image distances should have opposite signs:

$$d_o = -d_i. \tag{2.1}$$


An extended object such as the container in Figure 2.2 can be treated as a collection of points, and we can apply the method above to locate the image of each point on the extended object, thus forming the extended image.
Multiple Images
If an object is situated in front of two mirrors, you may see images in both mirrors. In addition, the image in the first mirror may act as an object for the second mirror, so the second mirror may form an image of the image. If the mirrors are placed parallel to each other and the object is placed at a point other than the midpoint between them, then this process of image-of-an-image continues without end, as you may have noticed when standing in a hallway with mirrors on each side. This is shown in Figure 2.3, which shows three images produced by the blue object. Notice that each reflection reverses front and back, just like pulling a right-hand glove inside out produces a left-hand glove (this is why a reflection of your right hand is a left hand). Thus, the fronts and backs of images 1 and 2 are both inverted with respect to the object, and the front and back of image 3 is inverted with respect to image 2, which is the object for image 3.


Figure 2.3 Two parallel mirrors can produce, in theory, an infinite number of images of an object placed off center between the mirrors. Three of these images are shown here. The front and back of each image is inverted with respect to its object. Note that the colors are only to identify the images. For normal mirrors, the color of an image is essentially the same as that of its object.


You may have noticed that image 3 is smaller than the object, whereas images 1 and 2 are the same size as the object. The ratio of the image height with respect to the object height is called magnification. More will be said about magnification in the next section.
Infinite reflections may terminate. For instance, two mirrors at right angles form three images, as shown in part (a) of Figure 2.4. Images 1 and 2 result from rays that reflect from only a single mirror, but image 1,2 is formed by rays that reflect from both mirrors. This is shown in the ray-tracing diagram in part (b) of Figure 2.4. To find image 1,2, you have to look behind the corner of the two mirrors.
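One way to see why the sequence of images stops at three for perpendicular mirrors is to treat each reflection as a coordinate flip. The sketch below is illustrative only: it assumes the two mirrors lie in the planes x = 0 and y = 0 with the object in the quadrant x > 0, y > 0, and the function names are hypothetical.

```python
def reflect_in_mirror_1(p):
    """Mirror 1 lies in the plane x = 0: reflection flips the x-coordinate."""
    return (-p[0], p[1])


def reflect_in_mirror_2(p):
    """Mirror 2 lies in the plane y = 0: reflection flips the y-coordinate."""
    return (p[0], -p[1])


def corner_images(obj):
    """Return the set of distinct image positions for two mirrors at a right angle."""
    image_1 = reflect_in_mirror_1(obj)        # single reflection from mirror 1
    image_2 = reflect_in_mirror_2(obj)        # single reflection from mirror 2
    image_12 = reflect_in_mirror_2(image_1)   # image 1 acts as object for mirror 2
    image_21 = reflect_in_mirror_1(image_2)   # image 2 acts as object for mirror 1
    return {image_1, image_2, image_12, image_21}


images = corner_images((2, 1))
print(sorted(images))   # [(-2, -1), (-2, 1), (2, -1)]
```

The two double reflections coincide, so only three distinct images appear. The doubly reflected image lies behind both mirror planes, so in this model neither mirror can use it as a further object.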






Figure 2.4 Two mirrors can produce multiple images. (a) Three images of a plastic head are visible in the two mirrors at a right angle. (b) A single object reflecting from two mirrors at a right angle can produce three images, as shown by the green, purple, and red images.


2.2 | Spherical Mirrors
Learning Objectives


By the end of this section, you will be able to:
• Describe image formation by spherical mirrors.
• Use ray diagrams and the mirror equation to calculate the properties of an image in a spherical mirror.


The image in a plane mirror has the same size as the object, is upright, and is the same distance behind the mirror as the object is in front of the mirror. A curved mirror, on the other hand, can form images that may be larger or smaller than the object and may form either in front of the mirror or behind it. In general, any curved surface will form an image, although some images may be so distorted as to be unrecognizable (think of fun house mirrors).
Because curved mirrors can create such a rich variety of images, they are used in many optical devices. We will concentrate on spherical mirrors for the most part, because they are easier to manufacture than mirrors such as parabolic mirrors and so are more common.
Curved Mirrors
We can define two general types of spherical mirrors. If the reflecting surface is the outer side of the sphere, the mirror is called a convex mirror. If the inside surface is the reflecting surface, it is called a concave mirror.
Symmetry is one of the major hallmarks of many optical devices, including mirrors and lenses. The symmetry axis of such optical elements is often called the principal axis or optical axis. For a spherical mirror, the optical axis passes through the mirror’s center of curvature and the mirror’s vertex, as shown in Figure 2.5.






Figure 2.5 A spherical mirror is formed by cutting out a piece of a sphere and silvering either the inside or outside surface. A concave mirror has silvering on the interior surface (think “cave”), and a convex mirror has silvering on the exterior surface.


Consider rays that are parallel to the optical axis of a parabolic mirror, as shown in part (a) of Figure 2.6. Following the law of reflection, these rays are reflected so that they converge at a point, called the focal point. Part (b) of this figure shows a spherical mirror that is large compared with its radius of curvature. For this mirror, the reflected rays do not cross at the same point, so the mirror does not have a well-defined focal point. This is called spherical aberration and results in a blurred image of an extended object. Part (c) shows a spherical mirror that is small compared to its radius of curvature. This mirror is a good approximation of a parabolic mirror, so rays that arrive parallel to the optical axis are reflected to a well-defined focal point. The distance along the optical axis from the mirror to the focal point is called the focal length of the mirror.


Figure 2.6 (a) Parallel rays reflected from a parabolic mirror cross at a single point called the focal point F. (b) Parallel rays reflected from a large spherical mirror do not cross at a common point. (c) If a spherical mirror is small compared with its radius of curvature, it better approximates the central part of a parabolic mirror, so parallel rays essentially cross at a common point. The distance along the optical axis from the mirror to the focal point is the focal length f of the mirror.


A convex spherical mirror also has a focal point, as shown in Figure 2.7. Incident rays parallel to the optical axis are reflected from the mirror and seem to originate from point F at focal length f behind the mirror. Thus, the focal point is virtual because no real rays actually pass through it; they only appear to originate from it.






Figure 2.7 (a) Rays reflected by a convex spherical mirror: Incident rays of light parallel to the optical axis are reflected from a convex spherical mirror and seem to originate from a well-defined focal point at focal distance f on the opposite side of the mirror. The focal point is virtual because no real rays pass through it. (b) Photograph of a virtual image formed by a convex mirror. (credit b: modification of work by Jenny Downing)


How does the focal length of a mirror relate to the mirror’s radius of curvature? Figure 2.8 shows a single ray that is reflected by a spherical concave mirror. The incident ray is parallel to the optical axis. The point at which the reflected ray crosses the optical axis is the focal point. Note that all incident rays that are parallel to the optical axis are reflected through the focal point; we show only one ray for simplicity. We want to find how the focal length FP (denoted by f) relates to the radius of curvature of the mirror, R, whose length is R = CF + FP. The law of reflection tells us that angles OXC and CXF are the same, and because the incident ray is parallel to the optical axis, angles OXC and XCP are also the same. Thus, triangle CXF is an isosceles triangle with CF = FX. If the angle θ is small (so that sin θ ≈ θ; this is called the “small-angle approximation”), then FX ≈ FP or CF ≈ FP. Inserting this into the equation for the radius R, we get

$$R = CF + FP = FP + FP = 2FP = 2f.$$


Figure 2.8 Reflection in a concave mirror. In the small-angle approximation, a ray that is parallel to the optical axis CP is reflected through the focal point F of the mirror.


In other words, in the small-angle approximation, the focal length f of a concave spherical mirror is half of its radius of curvature, R:

$$f = \frac{R}{2}. \tag{2.2}$$


In this chapter, we assume that the small-angle approximation (also called the paraxial approximation) is always valid. In this approximation, all rays are paraxial rays, which means that they make a small angle with the optical axis and are at a distance much less than the radius of curvature from the optical axis. In this case, their angles θ of reflection are small angles, so sin θ ≈ tan θ ≈ θ.
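To get a feel for how good the paraxial approximation is, the isosceles-triangle argument above can be carried through without approximating. The sketch below is an illustration under stated assumptions: it takes θ to be the angle XCP, so that sin θ = h/R for a ray at height h above the axis, and uses the isosceles triangle CXF with base CX = R to get CF = R/(2 cos θ). The function name is hypothetical.

```python
import math


def axis_crossing_distance(h, R):
    """Distance from the vertex P to where the reflected ray crosses the axis.

    h is the height of the incoming parallel ray above the optical axis and
    R is the radius of curvature (h should be well below R). No small-angle
    approximation is made in this sketch.
    """
    theta = math.asin(h / R)            # angle XCP in the geometry described above
    cf = R / (2.0 * math.cos(theta))    # isosceles triangle CXF with base CX = R
    return R - cf                       # FP = R - CF


# Compare the exact crossing point with the paraxial value R/2 for R = 1.
for h in (0.01, 0.1, 0.3, 0.5):
    print(f"h/R = {h:4.2f}  crossing at {axis_crossing_distance(h, 1.0):.4f} R")
```

For rays close to the axis the crossing point is very nearly R/2, while rays farther out cross noticeably closer to the mirror (roughly 0.42R at h = 0.5R). That spread of crossing points is the spherical aberration described earlier.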
Using Ray Tracing to Locate Images
To find the location of an image formed by a spherical mirror, we first use ray tracing, which is the technique of drawing rays and using the law of reflection to determine the reflected rays (later, for lenses, we use the law of refraction to determine refracted rays). Combined with some basic geometry, we can use ray tracing to find the focal point, the image location, and other information about how a mirror manipulates light. In fact, we already used ray tracing above to locate the focal point of spherical mirrors, or the image distance of flat mirrors. To locate the image of an object, you must locate at least two points of the image. Locating each point requires drawing at least two rays from a point on the object and constructing their reflected rays. The point at which the reflected rays intersect, either in real space or in virtual space, is where the corresponding point of the image is located. To make ray tracing easier, we concentrate on four “principal” rays whose reflections are easy to construct.
Figure 2.9 shows a concave mirror and a convex mirror, each with an arrow-shaped object in front of it. These are the objects whose images we want to locate by ray tracing. To do so, we draw rays from point Q that is on the object but not on the optical axis. We choose to draw our rays from the tip of the object. Principal ray 1 goes from point Q and travels parallel to the optical axis. The reflection of this ray must pass through the focal point, as discussed above. Thus, for the concave mirror, the reflection of principal ray 1 goes through focal point F, as shown in part (a) of the figure. For the convex mirror, the backward extension of the reflection of principal ray 1 goes through the focal point (i.e., a virtual focus). Principal ray 2 travels first on the line going through the focal point and then is reflected back along a line parallel to the optical axis. Principal ray 3 travels toward the center of curvature of the mirror, so it strikes the mirror at normal incidence and is reflected back along the line from which it came. Finally, principal ray 4 strikes the vertex of the mirror and is reflected symmetrically about the optical axis.






Figure 2.9 The four principal rays shown for both (a) a concave mirror and (b) a convex mirror. The image forms where the rays intersect (for real images) or where their backward extensions intersect (for virtual images).


The four principal rays intersect at point Q′, which is where the image of point Q is located. To locate point Q′, drawing any two of these principal rays would suffice. We are thus free to choose whichever of the principal rays we desire to locate the image. Drawing more than two principal rays is sometimes useful to verify that the ray tracing is correct.
To completely locate the extended image, we need to locate a second point in the image, so that we know how the image is oriented. To do this, we trace the principal rays from the base of the object. In this case, all four principal rays run along the optical axis, reflect from the mirror, and then run back along the optical axis. The difficulty is that, because these rays are collinear, we cannot determine a unique point where they intersect. All we know is that the base of the image is on the optical axis. However, because the mirror is symmetrical from top to bottom, it does not change the vertical orientation of the object. Thus, because the object is vertical, the image must be vertical. Therefore, the image of the base of the object is on the optical axis directly above the image of the tip, as drawn in the figure.
For the concave mirror, the extended image in this case forms between the focal point and the center of curvature of the mirror. It is inverted with respect to the object, is a real image, and is smaller than the object. Were we to move the object closer to or farther from the mirror, the characteristics of the image would change. For example, we show, as a later exercise, that an object placed between a concave mirror and its focal point leads to a virtual image that is upright and larger than the object. For the convex mirror, the extended image forms between the focal point and the mirror. It is upright with respect to the object, is a virtual image, and is smaller than the object.
Summary of Ray-Tracing Rules
Ray tracing is very useful for mirrors. The rules for ray tracing are summarized here for reference:


• A ray travelling parallel to the optical axis of a spherical mirror is reflected along a line that goes through the focal point of the mirror (ray 1 in Figure 2.9).
• A ray travelling along a line that goes through the focal point of a spherical mirror is reflected along a line parallel to the optical axis of the mirror (ray 2 in Figure 2.9).
• A ray travelling along a line that goes through the center of curvature of a spherical mirror is reflected back along the same line (ray 3 in Figure 2.9).
• A ray that strikes the vertex of a spherical mirror is reflected symmetrically about the optical axis of the mirror (ray 4 in Figure 2.9).


We use ray tracing to illustrate how images are formed by mirrors and to obtain numerical information about optical properties of the mirror. If we assume that a mirror is small compared with its radius of curvature, we can also use algebra and geometry to derive a mirror equation, which we do in the next section. Combining ray tracing with the mirror equation is a good way to analyze mirror systems.
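The ray-tracing rules can be turned into a small numerical sketch. The version below is illustrative only and rests on several assumptions not stated in the text: the mirror is replaced by the plane x = 0 through its vertex (the paraxial picture), x is measured from the vertex and is positive on the object side, the focal length f is taken as positive for a concave mirror and negative for a convex one, and the function name is hypothetical. It intersects the reflections of principal rays 1 and 4 drawn from the tip of the object.

```python
def locate_image_by_ray_tracing(d_o, h_o, f):
    """Intersect reflected principal rays 1 and 4 from the object tip (d_o, h_o).

    Returns (x, y) of the image of the tip. A positive x lies in front of the
    mirror and a negative x lies behind it. Fails with ZeroDivisionError when
    d_o == f, because the two reflected rays are then parallel.
    """
    # Ray 1: leaves the tip parallel to the axis, meets the mirror plane at
    # (0, h_o), and reflects along the line through the focal point (f, 0).
    slope_1, intercept_1 = -h_o / f, h_o
    # Ray 4: strikes the vertex (0, 0) and reflects symmetrically about the
    # axis, so the reflected line passes through (d_o, -h_o).
    slope_4, intercept_4 = -h_o / d_o, 0.0
    x = (intercept_4 - intercept_1) / (slope_1 - slope_4)
    y = slope_4 * x + intercept_4
    return x, y


# Hypothetical numbers: a 2-cm-tall object 30 cm from mirrors with |f| = 10 cm.
print(locate_image_by_ray_tracing(30.0, 2.0, 10.0))    # concave mirror
print(locate_image_by_ray_tracing(30.0, 2.0, -10.0))   # convex mirror
```

With these numbers the concave mirror gives an image point about 15 cm in front of the mirror and 1 cm below the axis (real, inverted, smaller), and the convex mirror gives one about 7.5 cm behind the mirror and 0.5 cm above the axis (virtual, upright, smaller), in line with the qualitative description of Figure 2.9.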
Image Formation by Reflection—The Mirror Equation
For a plane mirror, we showed that the image formed has the same height and orientation as the object, and it is located at the same distance behind the mirror as the object is in front of the mirror. Although the situation is a bit more complicated for curved mirrors, using geometry leads to simple formulas relating the object and image distances to the focal lengths of concave and convex mirrors.
Consider the object OP shown in Figure 2.10. The center of curvature of the mirror is labeled C and is a distance R from the vertex of the mirror, as marked in the figure. The object and image distances are labeled $d_o$ and $d_i$, and the object and image heights are labeled $h_o$ and $h_i$, respectively. Because the angles ϕ and ϕ′ are alternate interior angles, we know that they have the same magnitude. However, they must differ in sign if we measure angles from the optical axis, so ϕ = −ϕ′. An analogous scenario holds for the angles θ and θ′. The law of reflection tells us that they have the same magnitude, but their signs must differ if we measure angles from the optical axis. Thus, θ = −θ′. Taking the tangent of the angles θ and θ′, and using the property that tan(−θ) = −tan θ, gives us


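$$\left.\begin{aligned}\tan\theta &= \frac{h_o}{d_o}\\ \tan\theta' &= -\tan\theta = \frac{h_i}{d_i}\end{aligned}\right\}\quad \frac{h_o}{d_o} = -\frac{h_i}{d_i}\quad\text{or}\quad -\frac{h_o}{h_i} = \frac{d_o}{d_i}. \tag{2.3}$$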


Figure 2.10 Image formed by a concave mirror.


Similarly, taking the tangent of ϕ and ϕ′ gives
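$$\left.\begin{aligned}\tan\phi &= \frac{h_o}{d_o - R}\\ \tan\phi' &= -\tan\phi = \frac{h_i}{R - d_i}\end{aligned}\right\}\quad \frac{h_o}{d_o - R} = -\frac{h_i}{R - d_i}\quad\text{or}\quad -\frac{h_o}{h_i} = \frac{d_o - R}{R - d_i}.$$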


Combining these two results gives






$$\frac{d_o}{d_i} = \frac{d_o - R}{R - d_i}.$$


After a little algebra, this becomes
$$\frac{1}{d_o} + \frac{1}{d_i} = \frac{2}{R}. \tag{2.4}$$
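The “little algebra” leading to Equation 2.4 can be sketched as follows. The intermediate lines are a reconstruction rather than part of the original derivation, and they assume that none of $d_o$, $d_i$, and $R$ is zero. The steps are: cross-multiply, collect the terms containing $R$, and divide by $R\,d_o\,d_i$.

```latex
\begin{align*}
\frac{d_o}{d_i} &= \frac{d_o - R}{R - d_i} \\
d_o\,(R - d_i) &= d_i\,(d_o - R) \\
d_o R - d_o d_i &= d_i d_o - d_i R \\
R\,(d_o + d_i) &= 2\,d_o d_i \\
\frac{1}{d_i} + \frac{1}{d_o} &= \frac{2}{R}
\end{align*}
```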


No approximation is required for this result, so it is exact. However, as discussed above, in the small-angle approximation, the focal length of a spherical mirror is one-half the radius of curvature of the mirror, or f = R/2. Inserting this into Equation 2.4 gives the mirror equation:


$$\frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f}. \tag{2.5}$$


The mirror equation relates the image and object distances to the focal distance and is valid only in the small-angle approximation. Although it was derived for a concave mirror, it also holds for convex mirrors (proving this is left as an exercise). We can extend the mirror equation to the case of a plane mirror by noting that a plane mirror has an infinite radius of curvature. This means the focal point is at infinity, so the mirror equation simplifies to

$$d_o = -d_i \tag{2.6}$$


which is the same as Equation 2.1 obtained earlier.
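The plane-mirror limit can be illustrated numerically. The sketch below solves the mirror equation for the image distance and then lets the radius of curvature grow. The function name and the sample numbers are assumptions made for illustration, and returning infinity for an object placed exactly at the focal point is a choice of this sketch rather than something stated in the text.

```python
import math


def image_distance(d_o, f):
    """Solve 1/d_o + 1/d_i = 1/f for d_i (all lengths in the same units).

    Passing f = math.inf models a plane mirror. When the object sits exactly
    at the focal point, the reflected rays are parallel and this sketch
    returns math.inf.
    """
    inverse_d_i = 1.0 / f - 1.0 / d_o
    if inverse_d_i == 0.0:
        return math.inf
    return 1.0 / inverse_d_i


d_o = 25.0  # hypothetical object distance in cm
for R in (1e2, 1e4, 1e6, math.inf):
    print(f"R = {R:>9}  d_i = {image_distance(d_o, R / 2.0):.4f}")
```

As R increases, the computed image distance approaches −25 cm, that is, $d_i = -d_o$, which is the plane-mirror result.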
Notice that we have been very careful with the signs in deriving the mirror equation. For a plane mirror, the image distance has the opposite sign of the object distance. Also, the real image formed by the concave mirror in Figure 2.10 is on the opposite side of the optical axis with respect to the object. In this case, the image height should have the opposite sign of the object height. To keep track of the signs of the various quantities in the mirror equation, we now introduce a sign convention.
Sign convention for spherical mirrors
Using a consistent sign convention is very important in geometric optics. It assigns positive or negative values for the quantities that characterize an optical system. Understanding the sign convention allows you to describe an image without constructing a ray diagram. This text uses the following sign convention:


1. The focal length f is positive for concave mirrors and negative for convex mirrors.
2. The image distance $d_i$ is positive for real images and negative for virtual images.


Notice that rule 1 means that the radius of curvature of a spherical mirror can be positive or negative. What does it mean to have a negative radius of curvature? This means simply that the radius of curvature for a convex mirror is defined to be negative.
Image magnification
Let’s use the sign convention to further interpret the derivation of the mirror equation. In deriving this equation, we found that the object and image heights are related by

$$-\frac{h_o}{h_i} = \frac{d_o}{d_i}. \tag{2.7}$$


See Equation 2.3. Both the object and the image formed by the mirror in Figure 2.10 are real, so the object and image distances are both positive. The highest point of the object is above the optical axis, so the object height is positive. The image, however, is below the optical axis, so the image height is negative. Thus, this sign convention is consistent with our derivation of the mirror equation.
Equation 2.7 in fact describes the linear magnification (often simply called “magnification”) of the image in terms of the object and image distances. We thus define the dimensionless magnification m as follows:


(2.8)
m =


hi
ho


.


If m is positive, the image is upright, and if m is negative, the image is inverted. If |m| > 1, the image is larger than the object, and if |m| < 1, the image is smaller than the object. With this definition of magnification, we get the following relation between the vertical and horizontal object and image distances:

$$m = \frac{h_i}{h_o} = -\frac{d_i}{d_o}. \tag{2.9}$$


This is a very useful relation because it lets you obtain the magnification of the image from the object and image distances, which you can obtain from the mirror equation.
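As a numerical illustration of how the mirror equation and Equation 2.9 work together, the following minimal Python sketch computes the image distance and magnification for a given object distance and focal length. The function name `mirror_image` and the sample distances are assumptions made for illustration; they do not come from the text.

```python
import math


def mirror_image(d_o, f):
    """Return (d_i, m) from the mirror equation 1/d_o + 1/d_i = 1/f.

    Sign convention of this section: f > 0 for a concave mirror and
    f < 0 for a convex mirror; d_i > 0 for a real image and d_i < 0 for
    a virtual image. Pass f = math.inf for a plane mirror.
    """
    inv_d_i = 1.0 / f - 1.0 / d_o
    if inv_d_i == 0.0:
        # Object at the focal point: the reflected rays are parallel.
        raise ValueError("object at the focal point: image at infinity")
    d_i = 1.0 / inv_d_i
    m = -d_i / d_o  # Equation 2.9
    return d_i, m


# Hypothetical sample values (not from the text), all in centimeters.
for label, f in [("concave", 10.0), ("convex", -10.0), ("plane", math.inf)]:
    d_i, m = mirror_image(30.0, f)
    print(f"{label:8s} d_i = {d_i:7.2f} cm, m = {m:+.2f}")
```

For the plane mirror the sketch returns d_i = −d_o and m = +1, in agreement with Equation 2.6.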
Example 2.1


Solar Electric Generating System
One of the solar technologies used today for generating electricity involves a device (called a parabolic trough or concentrating collector) that concentrates sunlight onto a blackened pipe that contains a fluid. This heated fluid is pumped to a heat exchanger, where the thermal energy is transferred to another system that is used to generate steam and eventually generates electricity through a conventional steam cycle. Figure 2.11 shows such a working system in southern California. The real mirror is a parabolic cylinder with its focus located at the pipe; however, we can approximate the mirror as exactly one-quarter of a circular cylinder.


Figure 2.11 Parabolic trough collectors are used to generate electricity in southern California. (credit: “kjkolb”/Wikimedia Commons)
a. If we want the rays from the sun to focus at 40.0 cm from the mirror, what is the radius of the mirror?
b. What is the amount of sunlight concentrated onto the pipe, per meter of pipe length, assuming the insolation (incident solar radiation) is 900 W/m²?
c. If the fluid-carrying pipe has a 2.00-cm diameter, what is the temperature increase of the fluid per meter of pipe over a period of 1 minute? Assume that all solar radiation incident on the reflector is absorbed by the pipe, and that the fluid is mineral oil.


Strategy
First identify the physical principles involved. Part (a) is related to the optics of spherical mirrors. Part (b) involves a little math, primarily geometry. Part (c) requires an understanding of heat and density.
Solution
a. The sun is the object, so the object distance is essentially infinity: do = ∞. The desired image distance is di = 40.0 cm. We use the mirror equation to find the focal length of the mirror:

$$\frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f}$$

$$f = \left(\frac{1}{d_o} + \frac{1}{d_i}\right)^{-1} = \left(\frac{1}{\infty} + \frac{1}{40.0\ \text{cm}}\right)^{-1} = 40.0\ \text{cm}$$

Thus, the radius of the mirror is R = 2f = 80.0 cm.
b. The insolation is 900 W/m². You must find the cross-sectional area A of the concave mirror, since the power delivered is 900 W/m² × A. The mirror in this case is a quarter-section of a cylinder, so the area for a length L of the mirror is A = (1/4)(2πR)L. The area for a length of 1.00 m is then

$$A = \frac{\pi}{2} R\,(1.00\ \text{m}) = \frac{(3.14)}{2}(0.800\ \text{m})(1.00\ \text{m}) = 1.26\ \text{m}^2.$$

The insolation on the 1.00-m length of pipe is then

$$\left(9.00 \times 10^{2}\ \frac{\text{W}}{\text{m}^2}\right)\left(1.26\ \text{m}^2\right) = 1130\ \text{W}.$$


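c. The increase in temperature is given by Q = mcΔT. The mass m of the mineral oil in the one-meter section of pipe is

$$m = \rho V = \rho \pi \left(\frac{d}{2}\right)^{2}(1.00\ \text{m}) = \left(8.00 \times 10^{2}\ \text{kg/m}^3\right)(3.14)(0.0100\ \text{m})^{2}(1.00\ \text{m}) = 0.251\ \text{kg}$$

Therefore, the increase in temperature in one minute is

$$\Delta T = \frac{Q}{mc} = \frac{(1130\ \text{W})(60.0\ \text{s})}{(0.251\ \text{kg})\left(1670\ \text{J}/(\text{kg}\cdot{}^\circ\text{C})\right)} = 162\,^\circ\text{C}$$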


Significance
An array of such pipes in the California desert can provide a thermal output of 250 MW on a sunny day, with fluids reaching temperatures as high as 400°C. We are considering only one meter of pipe here and ignoring heat losses along the pipe.
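The arithmetic of this example can be checked with a short script. The sketch below is only a numerical restatement of the solution above, using the same quarter-cylinder area and the density and specific heat of mineral oil quoted in the solution; the variable names are illustrative. Because it carries unrounded intermediate values, its results agree with the worked solution to three significant figures rather than digit for digit.

```python
import math

# Part (a): the sun is at infinity, so the image forms at the focal point.
f = 0.400                      # focal length in m (desired image distance)
radius = 2.0 * f               # R = 2f

# Part (b): quarter-cylinder mirror area for a 1.00-m length, as in the text.
insolation = 900.0             # W/m^2
length = 1.00                  # m
area = 0.25 * (2.0 * math.pi * radius) * length
power = insolation * area      # W delivered to the pipe

# Part (c): heating of the mineral oil in one minute, from Q = m c dT.
rho = 800.0                    # kg/m^3, value used in the solution
c = 1670.0                     # J/(kg.degC), value used in the solution
d = 0.0200                     # pipe diameter in m
mass = rho * math.pi * (d / 2.0) ** 2 * length
delta_T = power * 60.0 / (mass * c)

print(f"R = {radius:.3f} m, A = {area:.2f} m^2, P = {power:.0f} W")
print(f"m = {mass:.3f} kg, delta T = {delta_T:.0f} degC")
```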


Example 2.2
Image in a Convex Mirror
A keratometer is a device used to measure the curvature of the cornea of the eye, particularly for fitting contact lenses. Light is reflected from the cornea, which acts like a convex mirror, and the keratometer measures the magnification of the image. The smaller the magnification, the smaller the radius of curvature of the cornea. If the light source is 12 cm from the cornea and the image magnification is 0.032, what is the radius of curvature of the cornea?
Strategy
If you find the focal length of the convex mirror formed by the cornea, then you know its radius of curvature (it’s twice the focal length). The object distance is do = 12 cm and the magnification is m = 0.032. First find the image distance di and then solve for the focal length f.
Solution
Start with the equation for magnification, m = −di/do . Solving for di and inserting the given values yields


di = −mdo = −(0.032)(12 cm) = −0.384 cm


where we retained an extra significant figure because this is an intermediate step in the calculation. Solve the mirror equation for the focal length f and insert the known values for the object and image distances. The result is

$$\frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f}$$

$$f = \left(\frac{1}{d_o} + \frac{1}{d_i}\right)^{-1} = \left(\frac{1}{12\ \text{cm}} + \frac{1}{-0.384\ \text{cm}}\right)^{-1} = -0.40\ \text{cm}$$

The radius of curvature is twice the focal length, so

$$R = 2f = -0.80\ \text{cm}$$


Significance
The focal length is negative, so the focus is virtual, as expected for a convex mirror and a real object. The radius of curvature found here is reasonable for a cornea. The distance from cornea to retina in an adult eye is about 2.0 cm. In practice, corneas may not be spherical, which complicates the job of fitting contact lenses. Note that the image distance here is negative, consistent with the fact that the image is behind the mirror. Thus, the image is virtual because no rays actually pass through it. In the problems and exercises, you will show that, for a fixed object distance, a smaller radius of curvature corresponds to a smaller magnification.
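The keratometer calculation can be sketched in a few lines of Python. The two helper functions below are hypothetical names introduced for illustration; they apply only the magnification relation m = −di/do and the mirror equation used in this example. The second function also lets you check numerically the claim that, at a fixed object distance, a smaller radius of curvature gives a smaller magnification.

```python
def radius_from_magnification(d_o, m):
    """Radius of curvature of a mirror from object distance and magnification."""
    d_i = -m * d_o                       # from m = -d_i / d_o
    f = 1.0 / (1.0 / d_o + 1.0 / d_i)    # mirror equation solved for f
    return 2.0 * f                       # R = 2f (negative for a convex mirror)


def magnification_from_radius(d_o, R):
    """Magnification of a spherical mirror of radius R for an object at d_o."""
    f = R / 2.0
    d_i = 1.0 / (1.0 / f - 1.0 / d_o)
    return -d_i / d_o


# Values from Example 2.2, in centimeters.
R = radius_from_magnification(12.0, 0.032)
print(f"R = {R:.2f} cm")

# Halving the magnitude of R reduces the magnification at the same distance.
print(magnification_from_radius(12.0, -0.80))
print(magnification_from_radius(12.0, -0.40))
```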


Problem-Solving Strategy: Spherical Mirrors
Step 1. First make sure that image formation by a spherical mirror is involved.
Step 2. Determine whether ray tracing, the mirror equation, or both are required. A sketch is very useful even if ray tracing is not specifically required by the problem. Write symbols and known values on the sketch.
Step 3. Identify exactly what needs to be determined in the problem (identify the unknowns).
Step 4. Make a list of what is given or can be inferred from the problem as stated (identify the knowns).
Step 5. If ray tracing is required, use the ray-tracing rules listed near the beginning of this section.
Step 6. Most quantitative problems require using the mirror equation. Use the examples as guides for using the mirror equation.
Step 7. Check to see whether the answer makes sense. Do the signs of object distance, image distance, and focal length correspond with what is expected from ray tracing? Is the sign of the magnification correct? Are the object and image distances reasonable?


Departure from the Small-Angle Approximation
The small-angle approximation is a cornerstone of the above discussion of image formation by a spherical mirror. When this approximation is violated, the image created by a spherical mirror becomes distorted. Such distortion is called aberration. Here we briefly discuss two specific types of aberrations: spherical aberration and coma.
Spherical aberration
Consider a broad beam of parallel rays impinging on a spherical mirror, as shown in Figure 2.12.






Figure 2.12 (a) With spherical aberration, the rays that are farther from the optical axis and the rays that are closer to the optical axis are focused at different points. Notice that the aberration gets worse for rays farther from the optical axis. (b) For comatic aberration, parallel rays that are not parallel to the optical axis are focused at different heights and at different focal lengths, so the image contains a “tail” like a comet (which is “coma” in Latin). Note that the colored rays are only to facilitate viewing; the colors do not indicate the color of the light.


The farther from the optical axis the rays strike, the worse the spherical mirror approximates a parabolic mirror. Thus, these rays are not focused at the same point as rays that are near the optical axis, as shown in the figure. Because of spherical aberration, the image of an extended object in a spherical mirror will be blurred. Spherical aberrations are characteristic of the mirrors and lenses that we consider in the following section of this chapter (more sophisticated mirrors and lenses are needed to eliminate spherical aberrations).
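To get a feeling for the size of this effect, the sketch below computes where a ray parallel to the optical axis crosses the axis after one reflection from a concave spherical mirror, without the small-angle approximation. It relies on a standard geometric result that is not derived in this text: a ray striking the mirror at height h, where sin θ = h/R, crosses the axis at a distance R − R/(2 cos θ) from the vertex. The function name and the sample heights are assumptions chosen for illustration.

```python
import math


def axis_crossing_from_vertex(h, R):
    """Distance from the vertex at which a ray parallel to the axis,
    striking a concave spherical mirror of radius R at height h,
    crosses the optical axis after a single reflection (requires h < R).
    """
    theta = math.asin(h / R)             # angle of incidence at the mirror
    return R - R / (2.0 * math.cos(theta))


R = 80.0  # cm, the radius found in Example 2.1
for h in (1.0, 10.0, 20.0, 40.0):
    x = axis_crossing_from_vertex(h, R)
    print(f"h = {h:5.1f} cm -> crosses the axis {x:6.2f} cm from the vertex")
```

Rays close to the axis cross very near the paraxial focal length R/2, whereas rays far from the axis cross noticeably closer to the mirror, which is the blurring described above.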
Coma or comatic aberration
Coma is similar to spherical aberration, but arises when the incoming rays are not parallel to the optical axis, as shown in part (b) of Figure 2.12. Recall that the small-angle approximation holds for spherical mirrors that are small compared to their radius. In this case, spherical mirrors are good approximations of parabolic mirrors. Parabolic mirrors focus all rays that are parallel to the optical axis at the focal point. However, parallel rays that are not parallel to the optical axis are focused at different heights and at different focal lengths, as shown in part (b) of Figure 2.12. Because a spherical mirror is symmetric about the optical axis, the various colored rays in this figure create circles of the corresponding color on the focal plane.
Although a spherical mirror is shown in part (b) of Figure 2.12, comatic aberration occurs also for parabolic mirrors; it does not result from a breakdown in the small-angle approximation. Spherical aberration, however, occurs only for spherical mirrors and is a result of a breakdown in the small-angle approximation. We will discuss both coma and spherical aberration later in this chapter, in connection with telescopes.






2.3 | Images Formed by Refraction
Learning Objectives


By the end of this section, you will be able to:
• Describe image formation by a single refracting surface
• Determine the location of an image and calculate its properties by using a ray diagram
• Determine the location of an image and calculate its properties by using the equation for a single refracting surface


When rays of light propagate from one medium to another, these rays undergo refraction, which is when light waves are bent at the interface between two media. The refracting surface can form an image in a similar fashion to a reflecting surface, except that the law of refraction (Snell’s law) is at the heart of the process instead of the law of reflection.
Refraction at a Plane Interface—Apparent Depth
If you look at a straight rod partially submerged in water, it appears to bend at the surface (Figure 2.13). The reason behind this curious effect is that the image of the rod inside the water forms a little closer to the surface than the actual position of the rod, so it does not line up with the part of the rod that is above the water. The same phenomenon explains why a fish in water appears to be closer to the surface than it actually is.


Figure 2.13 Bending of a rod at a water-air interface. Point P on the rod appears to be at point Q, which is where the image of point P forms due to refraction at the air-water interface.


To study image formation as a result of refraction, consider the following questions:
1. What happens to the rays of light when they enter or pass through a different medium?
2. Do the refracted rays originating from a single point meet at some point or diverge away from each other?


To be concrete, we consider a simple system consisting of two media separated by a plane interface (Figure 2.14). The object is in one medium and the observer is in the other. For instance, when you look at a fish from above the water surface, the fish is in medium 1 (the water) with refractive index 1.33, and your eye is in medium 2 (the air) with refractive index 1.00, and the surface of the water is the interface. The depth that you “see” is the image height hi and is called the apparent depth. The actual depth of the fish is the object height ho.






Figure 2.14 Apparent depth due to refraction. The real object at point P creates an image at point Q. The image is not at the same depth as the object, so the observer sees the image at an “apparent depth.”


The apparent depth hi depends on the angle at which you view the image. For a view from above (the so-called “normal” view), we can approximate the refraction angle θ to be small, and replace sin θ in Snell’s law by tan θ. With this approximation, you can use the triangles ΔOPR and ΔOQR to show that the apparent depth is given by

$$h_i = \left(\frac{n_2}{n_1}\right) h_o. \tag{2.10}$$


The derivation of this result is left as an exercise. Thus, a fish appears at 3/4 of the real depth when viewed from above.
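The apparent-depth relation is simple enough to evaluate directly. The following sketch applies Equation 2.10 to the fish example, with the refractive indices 1.33 and 1.00 quoted above; the function name and the 2.0-m depth are hypothetical choices made for illustration.

```python
def apparent_depth(actual_depth, n_object, n_observer):
    """Apparent depth for near-normal viewing, Equation 2.10.

    n_object is the index of the medium containing the object (n1) and
    n_observer is the index of the medium containing the observer (n2).
    """
    return (n_observer / n_object) * actual_depth


# A fish 2.0 m below the surface (hypothetical depth), viewed from air.
h_i = apparent_depth(2.0, n_object=1.33, n_observer=1.00)
print(f"apparent depth = {h_i:.2f} m, ratio = {h_i / 2.0:.3f}")
```

The ratio is 1/1.33, or roughly 3/4, as stated above.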
Refraction at a Spherical Interface
Spherical shapes play an important role in optics primarily because high-quality spherical shapes are far easier to manufacture than other curved surfaces. To study refraction at a single spherical surface, we assume that the medium with the spherical surface at one end continues indefinitely (a “semi-infinite” medium).
Refraction at a convex surface
Consider a point source of light at point P in front of a convex surface made of glass (see Figure 2.15). Let R be the radius of curvature, n1 be the refractive index of the medium in which object point P is located, and n2 be the refractive index of the medium with the spherical surface. We want to know what happens as a result of refraction at this interface.


Figure 2.15 Refraction at a convex surface (n2 > n1) .


Because of the symmetry involved, it is sufficient to examine rays in only one plane. The figure shows a ray of light that starts at the object point P, refracts at the interface, and goes through the image point P′. We derive a formula relating the object distance do, the image distance di, and the radius of curvature R.
Applying Snell’s law to the ray emanating from point P gives n1 sin θ1 = n2 sin θ2. We work in the small-angle approximation, so sin θ ≈ θ and Snell’s law then takes the form

$$n_1 \theta_1 \approx n_2 \theta_2.$$


From the geometry of the figure, we see that
θ1 = α + ϕ, θ2 = ϕ − β.


Inserting these expressions into Snell’s law gives
n1 (α + ϕ) ≈ n2 (ϕ − β).


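Using the diagram, we calculate the tangent of the angles α, β, and ϕ:

$$\tan\alpha \approx \frac{h}{d_o}, \quad \tan\beta \approx \frac{h}{d_i}, \quad \tan\phi \approx \frac{h}{R}.$$

Again using the small-angle approximation, we find that tan θ ≈ θ, so the above relationships become

$$\alpha \approx \frac{h}{d_o}, \quad \beta \approx \frac{h}{d_i}, \quad \phi \approx \frac{h}{R}.$$

Putting these angles into Snell’s law gives

$$n_1\left(\frac{h}{d_o} + \frac{h}{R}\right) = n_2\left(\frac{h}{R} - \frac{h}{d_i}\right).$$

We can write this more conveniently as

$$\frac{n_1}{d_o} + \frac{n_2}{d_i} = \frac{n_2 - n_1}{R}. \tag{2.11}$$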


If the object is placed at a special point called the first focus, or the object focus F1 , then the image is formed at infinity,
as shown in part (a) of Figure 2.16.


Figure 2.16 (a) First focus (called the “object focus”) for refraction at a convex surface. (b) Second focus (called the “image focus”) for refraction at a convex surface.


We can find the location f1 of the first focus F1 by setting di = ∞ in the preceding equation.

$$\frac{n_1}{f_1} + \frac{n_2}{\infty} = \frac{n_2 - n_1}{R} \tag{2.12}$$

$$f_1 = \frac{n_1 R}{n_2 - n_1} \tag{2.13}$$

Similarly, we can define a second focus or image focus F2 where the image is formed for an object that is far away [part (b) of Figure 2.16]. The location of the second focus F2 is obtained from Equation 2.11 by setting do = ∞:

$$\frac{n_1}{\infty} + \frac{n_2}{f_2} = \frac{n_2 - n_1}{R}$$

$$f_2 = \frac{n_2 R}{n_2 - n_1}.$$


Note that the object focus is at a different distance from the vertex than the image focus because n1 ≠ n2 .
Sign convention for single refracting surfaces
Although we derived this equation for refraction at a convex surface, the same expression holds for a concave surface, provided we use the following sign convention:


1. R > 0 if surface is convex toward object; otherwise, R < 0.
2. di > 0 if image is real and on opposite side from the object; otherwise, di < 0.
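The single-surface equation and its two focal lengths can be explored numerically. The sketch below solves Equation 2.11 for the image distance under the sign convention just listed and evaluates the expressions for f1 and f2. The function names and the sample values (an air-glass surface with n2 = 1.50 and R = 10 cm) are assumptions made for illustration only.

```python
import math


def refraction_image_distance(d_o, R, n1, n2):
    """Solve n1/d_o + n2/d_i = (n2 - n1)/R (Equation 2.11) for d_i.

    R > 0 if the surface is convex toward the object; use R = math.inf
    for a plane interface and d_o = math.inf for a very distant object.
    """
    rhs = (n2 - n1) / R - n1 / d_o
    return math.inf if rhs == 0.0 else n2 / rhs


def focal_lengths(R, n1, n2):
    """Return (f1, f2): the object-focus and image-focus distances."""
    return n1 * R / (n2 - n1), n2 * R / (n2 - n1)


# Hypothetical air-to-glass surface, distances in centimeters.
n1, n2, R = 1.00, 1.50, 10.0
print(focal_lengths(R, n1, n2))
print(refraction_image_distance(math.inf, R, n1, n2))   # distant object
print(refraction_image_distance(60.0, R, n1, n2))       # real image

# Plane interface: a fish 2.0 m deep in water, viewed from air.
print(refraction_image_distance(2.0, math.inf, 1.33, 1.00))
```

In the plane-interface case the image distance is negative, so the image is virtual and lies on the same side as the object, at the apparent depth given by Equation 2.10.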


2.4 | Thin Lenses
Learning Objectives


By the end of this section, you will be able to:
• Use ray diagrams to locate and describe the image formed by a lens
• Employ the thin-lens equation to describe and locate the image formed by a lens


Lenses are found in a huge array of optical instruments, ranging from a simple magnifying glass to a camera’s zoom lens to the eye itself. In this section, we use Snell’s law to explore the properties of lenses and how they form images.
The word “lens” derives from the Latin word for a lentil bean, the shape of which is similar to a convex lens. However, not all lenses have the same shape. Figure 2.17 shows a variety of different lens shapes. The vocabulary used to describe lenses is the same as that used for spherical mirrors: The axis of symmetry of a lens is called the optical axis, the point where this axis intersects the lens surface is called the vertex of the lens, and so forth.


Figure 2.17 Various types of lenses: Note that a converging lens has a thicker “waist,” whereas a diverging lens has a thinner waist.






A convex or converging lens is shaped so that all light rays that enter it parallel to its optical axis intersect (or focus) at a single point on the optical axis on the opposite side of the lens, as shown in part (a) of Figure 2.18. Likewise, a concave or diverging lens is shaped so that all rays that enter it parallel to its optical axis diverge, as shown in part (b). To understand more precisely how a lens manipulates light, look closely at the top ray that goes through the converging lens in part (a). Because the index of refraction of the lens is greater than that of air, Snell’s law tells us that the ray is bent toward the perpendicular to the interface as it enters the lens. Likewise, when the ray exits the lens, it is bent away from the perpendicular. The same reasoning applies to the diverging lenses, as shown in part (b). The overall effect is that light rays are bent toward the optical axis for a converging lens and away from the optical axis for diverging lenses. For a converging lens, the point at which the rays cross is the focal point F of the lens. For a diverging lens, the point from which the rays appear to originate is the (virtual) focal point. The distance from the center of the lens to its focal point is the focal length f of the lens.


Figure 2.18 Rays of light entering (a) a converging lens and (b) a diverging lens, parallel to its axis, converge at its focal point F. The distance from the center of the lens to the focal point is the lens’s focal length f. Note that the light rays are bent upon entering and exiting the lens, with the overall effect being to bend the rays toward the optical axis.


A lens is considered to be thin if its thickness t is much less than the radii of curvature of both surfaces, as shown in Figure 2.19. In this case, the rays may be considered to bend once at the center of the lens. For the case drawn in the figure, light ray 1 is parallel to the optical axis, so the outgoing ray is bent once at the center of the lens and goes through the focal point. Another important characteristic of thin lenses is that light rays that pass through the center of the lens are undeviated, as shown by light ray 2.






Figure 2.19 In the thin-lens approximation, the thickness d of the lens is much, much less than the radii R1 and R2 of curvature of the surfaces of the lens. Light rays are considered to bend at the center of the lens, such as light ray 1. Light ray 2 passes through the center of the lens and is undeviated in the thin-lens approximation.


As noted in the initial discussion of Snell’s law, the paths of light rays are exactly reversible. This means that the direction of the arrows could be reversed for all of the rays in Figure 2.18. For example, if a point-light source is placed at the focal point of a convex lens, as shown in Figure 2.20, parallel light rays emerge from the other side.


Figure 2.20 A small light source, like a light bulb filament, placed at the focal point of a convex lens results in parallel rays of light emerging from the other side. The paths are exactly the reverse of those shown in Figure 2.18 in converging and diverging lenses. This technique is used in lighthouses and sometimes in traffic lights to produce a directional beam of light from a source that emits light in all directions.
Ray Tracing and Thin Lenses
Ray tracing is the technique of determining or following (tracing) the paths taken by light rays.
Ray tracing for thin lenses is very similar to the technique we used with spherical mirrors. As for mirrors, ray tracing can accurately describe the operation of a lens. The rules for ray tracing for thin lenses are similar to those of spherical mirrors:


1. A ray entering a converging lens parallel to the optical axis passes through the focal point on the other side of the lens (ray 1 in part (a) of Figure 2.21). A ray entering a diverging lens parallel to the optical axis exits along the line that passes through the focal point on the same side of the lens (ray 1 in part (b) of the figure).
2. A ray passing through the center of either a converging or a diverging lens is not deviated (ray 2 in parts (a) and (b)).
3. For a converging lens, a ray that passes through the focal point exits the lens parallel to the optical axis (ray 3 in part (a)). For a diverging lens, a ray that approaches along the line that passes through the focal point on the opposite side exits the lens parallel to the axis (ray 3 in part (b)).


Figure 2.21 Thin lenses have the same focal lengths on either side. (a) Parallel light rays entering a converging lens from the right cross at its focal point on the left. (b) Parallel light rays entering a diverging lens from the right seem to come from the focal point on the right.


Thin lenses work quite well for monochromatic light (i.e., light of a single wavelength). However, for light that contains several wavelengths (e.g., white light), the lenses work less well. The problem is that, as we learned in the previous chapter, the index of refraction of a material depends on the wavelength of light. This phenomenon is responsible for many colorful effects, such as rainbows. Unfortunately, this phenomenon also leads to aberrations in images formed by lenses. In particular, because the focal distance of the lens depends on the index of refraction, it also depends on the wavelength of the incident light. This means that light of different wavelengths will focus at different points, resulting in so-called “chromatic aberrations.” In particular, the edges of an image of a white object will become colored and blurred. Special lenses called doublets are capable of correcting chromatic aberrations. A doublet is formed by gluing together a converging lens and a diverging lens. The combined doublet lens produces significantly reduced chromatic aberrations.
Image Formation by Thin Lenses
We use ray tracing to investigate different types of images that can be created by a lens. In some circumstances, a lens forms a real image, such as when a movie projector casts an image onto a screen. In other cases, the image is a virtual image, which cannot be projected onto a screen. Where, for example, is the image formed by eyeglasses? We use ray tracing for thin lenses to illustrate how they form images, and then we develop equations to analyze quantitatively the properties of thin lenses.
Consider an object some distance away from a converging lens, as shown in Figure 2.22. To find the location and size of the image, we trace the paths of selected light rays originating from one point on the object, in this case, the tip of the arrow.






The figure shows three rays from many rays that emanate from the tip of the arrow. These three rays can be traced by using the ray-tracing rules given above.
• Ray 1 enters the lens parallel to the optical axis and passes through the focal point on the opposite side (rule 1).
• Ray 2 passes through the center of the lens and is not deviated (rule 2).
• Ray 3 passes through the focal point on its way to the lens and exits the lens parallel to the optical axis (rule 3).


The three rays cross at a single point on the opposite side of the lens. Thus, the image of the tip of the arrow is located at this point. All rays that come from the tip of the arrow and enter the lens are refracted and cross at the point shown.
After locating the image of the tip of the arrow, we need another point of the image to orient the entire image of the arrow. We chose to locate the image base of the arrow, which is on the optical axis. As explained in the section on spherical mirrors, the base will be on the optical axis just above the image of the tip of the arrow (due to the top-bottom symmetry of the lens). Thus, the image spans the optical axis to the (negative) height shown. Rays from another point on the arrow, such as the middle of the arrow, cross at another common point, thus filling in the rest of the image.
Although three rays are traced in this figure, only two are necessary to locate a point of the image. It is best to trace rays for which there are simple ray-tracing rules.


Figure 2.22 Ray tracing is used to locate the image formed by a lens. Rays originating from the same point on the object are traced; the three chosen rays each follow one of the rules for ray tracing, so that their paths are easy to determine. The image is located at the point where the rays cross. In this case, a real image (one that can be projected on a screen) is formed.


Several important distances appear in the figure. As for a mirror, we define do to be the object distance, or the distance of an object from the center of a lens. The image distance di is defined to be the distance of the image from the center of a lens. The height of the object and the height of the image are indicated by ho and hi, respectively. Images that appear upright relative to the object have positive heights, and those that are inverted have negative heights. By using the rules of ray tracing and making a scale drawing with paper and pencil, like that in Figure 2.22, we can accurately describe the location and size of an image. But the real benefit of ray tracing is in visualizing how images are formed in a variety of situations.
Oblique Parallel Rays and Focal Plane
We have seen that rays parallel to the optical axis are directed to the focal point of a converging lens. In the case of a diverging lens, they come out in a direction such that they appear to be coming from the focal point on the opposite side of the lens (i.e., the side from which parallel rays enter the lens). What happens to parallel rays that are not parallel to the optical axis (Figure 2.23)? In the case of a converging lens, these rays do not converge at the focal point. Instead, they come together on another point in the plane called the focal plane. The focal plane contains the focal point and is perpendicular to the optical axis. As shown in the figure, parallel rays focus where the ray through the center of the lens crosses the focal plane.






Figure 2.23 Parallel oblique rays focus on a point in a focal plane.
Thin-Lens Equation
Ray tracing allows us to get a qualitative picture of image formation. To obtain numeric information, we derive a pair of equations from a geometric analysis of ray tracing for thin lenses. These equations, called the thin-lens equation and the lens maker’s equation, allow us to quantitatively analyze thin lenses.
Consider the thick bi-convex lens shown in Figure 2.24. The index of refraction of the surrounding medium is n1 (if the
lens is in air, then n1 = 1.00 ) and that of the lens is n2 . The radii of curvatures of the two sides are R1 and R2 . We wish
to find a relation between the object distance do , the image distance di , and the parameters of the lens.


Figure 2.24 Figure for deriving the lens maker’s equation. Here, t is the thickness of lens, n1 is the index of refraction of the
exterior medium, and n2 is the index of refraction of the lens. We take the limit of t → 0 to obtain the formula for a thin lens.


To derive the thin-lens equation, we consider the image formed by the first refracting surface (i.e., left surface) and then use this image as the object for the second refracting surface. In the figure, the image from the first refracting surface is Q′, which is formed by extending backwards the rays from inside the lens (these rays result from refraction at the first surface). This is shown by the dashed lines in the figure. Notice that this image is virtual because no rays actually pass through the point Q′. To find the image distance di′ corresponding to the image Q′, we use Equation 2.11. In this case, the object distance is do, the image distance is di′, and the radius of curvature is R1. Inserting these into Equation 2.11 gives

$$\frac{n_1}{d_o} + \frac{n_2}{d_i'} = \frac{n_2 - n_1}{R_1}. \tag{2.14}$$


The image is virtual and on the same side as the object, so di′ < 0 and do > 0. The first surface is convex toward the object, so R1 > 0.
To find the object distance for the object Q formed by refraction from the second interface, note that the roles of the indices of refraction n1 and n2 are interchanged in Equation 2.11. In Figure 2.24, the rays originate in the medium with index n2, whereas in Figure 2.15, the rays originate in the medium with index n1. Thus, we must interchange n1 and n2 in Equation 2.11. In addition, by consulting again Figure 2.24, we see that the object distance is do′ and the image distance is di. The radius of curvature is R2. Inserting these quantities into Equation 2.11 gives

$$\frac{n_2}{d_o'} + \frac{n_1}{d_i} = \frac{n_1 - n_2}{R_2}. \tag{2.15}$$


The image is real and on the opposite side from the object, so di > 0 and do′ > 0 . The second surface is convex away from
the object, so R2 < 0 . Equation 2.15 can be simplified by noting that do′ = |di′| + t , where we have taken the absolute
value because di′ is a negative number, whereas both do′ and t are positive. We can dispense with the absolute value if we
negate di′ , which gives do′ = −di′ + t . Inserting this into Equation 2.15 gives


$$\frac{n_2}{-d_i' + t} + \frac{n_1}{d_i} = \frac{n_1 - n_2}{R_2}. \tag{2.16}$$


Summing Equation 2.14 and Equation 2.16 gives

$$\frac{n_1}{d_o} + \frac{n_1}{d_i} + \frac{n_2}{d_i'} + \frac{n_2}{-d_i' + t} = (n_2 - n_1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right). \tag{2.17}$$


In the thin-lens approximation, we assume that the lens is very thin compared to the first image distance, or t ≪ di′ (or,
equivalently, t ≪ R1 and R2 ). In this case, the third and fourth terms on the left-hand side of Equation 2.17 cancel,
leaving us with


$$\frac{n_1}{d_o} + \frac{n_1}{d_i} = (n_2 - n_1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right).$$


Dividing by n1 gives us finally

$$\frac{1}{d_o} + \frac{1}{d_i} = \left(\frac{n_2}{n_1} - 1\right)\left(\frac{1}{R_1} - \frac{1}{R_2}\right). \tag{2.18}$$


The left-hand side looks suspiciously like the mirror equation that we derived above for spherical mirrors. As done forspherical mirrors, we can use ray tracing and geometry to show that, for a thin lens,


$$\frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f} \tag{2.19}$$


where f is the focal length of the thin lens (this derivation is left as an exercise). This is the thin-lens equation. The focal length of a thin lens is the same to the left and to the right of the lens. Combining Equation 2.18 and Equation 2.19 gives


$$\frac{1}{f} = \left(\frac{n_2}{n_1} - 1\right)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \tag{2.20}$$





which is called the lens maker’s equation. It shows that the focal length of a thin lens depends only on the radii of curvature and the index of refraction of the lens and that of the surrounding medium. For a lens in air, n1 = 1.0 and n2 ≡ n , so the
lens maker’s equation reduces to




This OpenStax book is available for free at http://cnx.org/content/col12067/1.4




$$\frac{1}{f} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right). \tag{2.21}$$


Sign conventions for lenses
To properly use the thin-lens equation, the following sign conventions must be obeyed:


1. di is positive if the image is on the side opposite the object (i.e., real image); otherwise, di is negative (i.e., virtual
image).


2. f is positive for a converging lens and negative for a diverging lens.
3. R is positive for a surface convex toward the object, and negative for a surface concave toward the object.
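
The lens maker’s equation and the sign convention for R can be tried out numerically. The following is a minimal sketch, assuming Equation 2.20 for a thin lens; the function name and the sample lens (n = 1.50, |R| = 10.0 cm) are hypothetical values chosen only for illustration.

```python
def lens_maker_focal_length(n_lens, r1, r2, n_medium=1.0):
    """Focal length of a thin lens from the lens maker's equation (Equation 2.20).

    r1, r2: radii of curvature of the first and second surfaces, signed
    according to the convention above (positive if convex toward the object).
    The result has the same length unit as r1 and r2.
    """
    inverse_f = (n_lens / n_medium - 1.0) * (1.0 / r1 - 1.0 / r2)
    return 1.0 / inverse_f


# Hypothetical symmetric biconvex lens in air: R1 > 0, R2 < 0.
f_biconvex = lens_maker_focal_length(1.50, 10.0, -10.0)
# The same glass ground as a symmetric biconcave lens: R1 < 0, R2 > 0.
f_biconcave = lens_maker_focal_length(1.50, -10.0, 10.0)

print(f"biconvex:  f = {f_biconvex:+.1f} cm")   # positive: converging
print(f"biconcave: f = {f_biconcave:+.1f} cm")  # negative: diverging
```

Swapping the signs of the two radii flips the sign of f, which matches convention 2: the biconvex lens is converging and the biconcave lens is diverging.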


Magnification
By using a finite-size object on the optical axis and ray tracing, you can show that the magnification m of an image is


$$m \equiv \frac{h_i}{h_o} = -\frac{d_i}{d_o} \tag{2.22}$$


(where the three lines mean “is defined as”). This is exactly the same equation as we obtained for mirrors (see Equation 2.8). If m > 0 , then the image has the same vertical orientation as the object (called an “upright” image). If m < 0 , then
the image has the opposite vertical orientation to the object (called an “inverted” image).
Using the Thin-Lens Equation
The thin-lens equation and the lens maker’s equation are broadly applicable to situations involving thin lenses. We explore many features of image formation in the following examples.
Consider a thin converging lens. Where does the image form and what type of image is formed as the object approaches the lens from infinity? This may be seen by using the thin-lens equation for a given focal length to plot the image distance as a function of object distance. In other words, we plot


$$d_i = \left(\frac{1}{f} - \frac{1}{d_o}\right)^{-1}$$


for a given value of f. For f = 1 cm , the result is shown in part (a) of Figure 2.25.


Figure 2.25 (a) Image distance for a thin converging lens with f = 1.0 cm as a function of object distance. (b) Same thing
but for a diverging lens with f = −1.0 cm .


An object much farther than the focal length f from the lens should produce an image near the focal plane, because the second term on the right-hand side of the equation above becomes negligible compared to the first term, so we have di ≈ f .
This can be seen in the plot of part (a) of the figure, which shows that the image distance asymptotically approaches the focal length of 1 cm for larger object distances. As the object approaches the focal plane, the image distance diverges to positive infinity. This is expected because an object at the focal plane produces parallel rays that form an image at infinity (i.e., very far from the lens). When the object is farther than the focal length from the lens, the image distance is positive, so the image is real, on the opposite side of the lens from the object, and inverted (because m = −di/do ). When the object is closer than the focal length from the lens, the image distance becomes negative, which means that the image is virtual, on the same side of the lens as the object, and upright.
For a thin diverging lens of focal length f = −1.0 cm , a similar plot of image distance vs. object distance is shown in part (b). In this case, the image distance is negative for all positive object distances, which means that the image is virtual, on the same side of the lens as the object, and upright. These characteristics may also be seen by ray-tracing diagrams (see Figure 2.26).
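
The trends read off the two plots can also be checked by tabulating the thin-lens equation directly. The sketch below is illustrative only: the function name and the sample object distances are assumptions, and the focal lengths f = ±1 cm are the ones used for the plots.

```python
import math


def image_distance(f, d_o):
    """Thin-lens image distance d_i = (1/f - 1/d_o)^(-1).

    Returns math.inf when the object sits exactly at the focal point,
    where the outgoing rays are parallel.
    """
    inverse_di = 1.0 / f - 1.0 / d_o
    if inverse_di == 0.0:
        return math.inf
    return 1.0 / inverse_di


# Sample object distances in cm (chosen for illustration).
object_distances_cm = [0.25, 0.5, 1.0, 1.5, 2.0, 5.0, 100.0]

print(" d_o (cm)   d_i, f=+1 cm   d_i, f=-1 cm")
for d_o in object_distances_cm:
    converging = image_distance(1.0, d_o)
    diverging = image_distance(-1.0, d_o)
    print(f"{d_o:9.2f} {converging:14.3f} {diverging:14.3f}")
```

For the converging lens the table changes sign at do = f and settles toward di ≈ f for large do, whereas every entry for the diverging lens is negative.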


Figure 2.26 The red dots show the focal points of the lenses. (a) A real, inverted image formed from an object that is farther than the focal length from a converging lens. (b) A virtual, upright image formed from an object that is closer than a focal length from the lens. (c) A virtual, upright image formed from an object that is farther than a focal length from a diverging lens.


To see a concrete example of upright and inverted images, look at Figure 2.27, which shows images formed by converging lenses when the object (the person’s face in this case) is placed at different distances from the lens. In part (a) of the figure, the person’s face is farther than one focal length from the lens, so the image is inverted. In part (b), the person’s face is closer than one focal length from the lens, so the image is upright.






Figure 2.27 (a) When a converging lens is held farther than one focal length from the man’s face, an inverted image is formed. Note that the image is in focus but the face is not, because the image is much closer to the camera taking this photograph than the face. (b) An upright image of the man’s face is produced when a converging lens is held at less than one focal length from his face. (credit a: modification of work by “DaMongMan”/Flickr; credit b: modification of work by Casey Fleser)


Work through the following examples to better understand how thin lenses work.
Problem-Solving Strategy: Lenses
Step 1. Determine whether ray tracing, the thin-lens equation, or both would be useful. Even if ray tracing is not used, a careful sketch is always very useful. Write symbols and values on the sketch.
Step 2. Identify what needs to be determined in the problem (identify the unknowns).
Step 3. Make a list of what is given or can be inferred from the problem (identify the knowns).
Step 4. If ray tracing is required, use the ray-tracing rules listed near the beginning of this section.
Step 5. Most quantitative problems require the use of the thin-lens equation and/or the lens maker’s equation. Solve these for the unknowns and insert the given quantities or use both together to find two unknowns.
Step 6. Check to see if the answer is reasonable. Are the signs correct? Is the sketch or ray tracing consistent with the calculation?


Example 2.3
Using the Lens Maker’s Equation
Find the radius of curvature of a biconcave lens symmetrically ground from a glass with index of refraction 1.55 so that its focal length in air is 20 cm (for a biconcave lens, both surfaces have the same radius of curvature).
Strategy
Use the thin-lens form of the lens maker’s equation:


$$\frac{1}{f} = \left(\frac{n_2}{n_1} - 1\right)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$





where R1 < 0 and R2 > 0 . Since we are making a symmetric biconcave lens, we have |R1| = |R2| .






Solution
We can determine the radius of curvature R from

$$\frac{1}{f} = \left(\frac{n_2}{n_1} - 1\right)\left(-\frac{2}{R}\right).$$


Solving for R and inserting f = −20 cm, n2 = 1.55, and n1 = 1.00 gives

$$R = -2f\left(\frac{n_2}{n_1} - 1\right) = -2(-20\ \text{cm})\left(\frac{1.55}{1.00} - 1\right) = 22\ \text{cm}.$$
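
The factor −2/R used in the Solution follows from the sign conventions quoted in the Strategy. As a brief worked step (added here for clarity), let R denote the common magnitude |R1| = |R2|:

```latex
\begin{align*}
R_1 = -R,\quad R_2 = +R
  \;&\Rightarrow\;
  \frac{1}{R_1} - \frac{1}{R_2} = -\frac{1}{R} - \frac{1}{R} = -\frac{2}{R}, \\
\frac{1}{f} = \left(\frac{n_2}{n_1} - 1\right)\left(-\frac{2}{R}\right)
  \;&\Rightarrow\;
  R = -2f\left(\frac{n_2}{n_1} - 1\right)
    = -2(-20\ \text{cm})(0.55) = 22\ \text{cm}.
\end{align*}
```

The focal length enters as f = −20 cm because a biconcave lens in air is diverging.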


Example 2.4
Converging Lens and Different Object Distances
Find the location, orientation, and magnification of the image for a 3.0-cm-high object at each of the following positions in front of a convex lens of focal length 10.0 cm: (a) do = 50.0 cm , (b) do = 5.00 cm , and
(c) do = 20.0 cm .
Strategy
We start with the thin-lens equation $\frac{1}{d_i} + \frac{1}{d_o} = \frac{1}{f}$. Solve this for the image distance di and insert the given object distance and focal length.
Solution
a. For do = 50.0 cm, f = +10.0 cm , this gives

$$d_i = \left(\frac{1}{f} - \frac{1}{d_o}\right)^{-1} = \left(\frac{1}{10.0\ \text{cm}} - \frac{1}{50.0\ \text{cm}}\right)^{-1} = 12.5\ \text{cm}.$$


The image distance is positive, so the image is real, is on the opposite side of the lens from the object, and is 12.5 cm from the lens. To find the magnification and orientation of the image, use

$$m = -\frac{d_i}{d_o} = -\frac{12.5\ \text{cm}}{50.0\ \text{cm}} = -0.250.$$


The negative magnification means that the image is inverted. Since |m| < 1 , the image is smaller than
the object. The size of the image is given by


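$$|h_i| = |m|h_o = (0.250)(3.0\ \text{cm}) = 0.75\ \text{cm}.$$

b. For do = 5.00 cm, f = +10.0 cm ,

$$d_i = \left(\frac{1}{f} - \frac{1}{d_o}\right)^{-1} = \left(\frac{1}{10.0\ \text{cm}} - \frac{1}{5.00\ \text{cm}}\right)^{-1} = -10.0\ \text{cm}.$$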


The image distance is negative, so the image is virtual, is on the same side of the lens as the object, and is 10.0 cm from the lens. The magnification and orientation of the image are found from

$$m = -\frac{d_i}{d_o} = -\frac{-10.0\ \text{cm}}{5.00\ \text{cm}} = +2.00.$$






The positive magnification means that the image is upright (i.e., it has the same orientation as the object). Since |m| > 1 , the image is larger than the object. The size of the image is

$$|h_i| = |m|h_o = (2.00)(3.0\ \text{cm}) = 6.0\ \text{cm}.$$

c. For do = 20.0 cm, f = +10.0 cm ,

$$d_i = \left(\frac{1}{f} - \frac{1}{d_o}\right)^{-1} = \left(\frac{1}{10.0\ \text{cm}} - \frac{1}{20.0\ \text{cm}}\right)^{-1} = 20.0\ \text{cm}.$$


The image distance is positive, so the image is real, is on the opposite side of the lens from the object, and is 20.0 cm from the lens. The magnification is

$$m = -\frac{d_i}{d_o} = -\frac{20.0\ \text{cm}}{20.0\ \text{cm}} = -1.00.$$


The negative magnification means that the image is inverted. Since |m| = 1 , the image is the same size
as the object.
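
The three cases of this example follow the same recipe, so they can be reproduced in a few lines. This is a minimal sketch rather than a general optics routine: the function name and the returned dictionary keys are hypothetical, and the code assumes the object is not exactly at the focal point (where the division would fail).

```python
def describe_image(f, d_o, h_o):
    """Apply the thin-lens equation and m = -d_i/d_o to a single object.

    All lengths must share one unit (here cm). Assumes d_o != f.
    """
    d_i = 1.0 / (1.0 / f - 1.0 / d_o)
    m = -d_i / d_o
    return {
        "d_i": d_i,
        "m": m,
        "h_i": abs(m) * h_o,
        "kind": "real" if d_i > 0 else "virtual",
        "orientation": "upright" if m > 0 else "inverted",
    }


# Values from Example 2.4: f = 10.0 cm, object height 3.0 cm.
for label, d_o in [("a", 50.0), ("b", 5.00), ("c", 20.0)]:
    image = describe_image(f=10.0, d_o=d_o, h_o=3.0)
    print(
        f"({label}) d_i = {image['d_i']:+.1f} cm, m = {image['m']:+.3f}, "
        f"|h_i| = {image['h_i']:.2f} cm, {image['kind']}, {image['orientation']}"
    )
```

The signs of di and m carry the classification: a positive di marks a real image, and a positive m marks an upright one.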


When solving problems in geometric optics, we often need to combine ray tracing and the lens equations. The following example demonstrates this approach.
Example 2.5


Choosing the Focal Length and Type of Lens
To project an image of a light bulb on a screen 1.50 m away, you need to choose what type of lens to use (converging or diverging) and its focal length (Figure 2.28). The distance between the lens and the light bulb is fixed at 0.75 m. Also, what are the magnification and orientation of the image?
Strategy
The image must be real, so you choose to use a converging lens. The focal length can be found by using the thin-lens equation and solving for the focal length. The object distance is do = 0.75 m and the image distance is
di = 1.5 m .
Solution
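Solve the thin-lens equation for the focal length and insert the desired object and image distances:

$$\frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f}$$

$$f = \left(\frac{1}{d_o} + \frac{1}{d_i}\right)^{-1} = \left(\frac{1}{0.75\ \text{m}} + \frac{1}{1.5\ \text{m}}\right)^{-1} = 0.50\ \text{m}.$$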


The magnification is

$$m = -\frac{d_i}{d_o} = -\frac{1.5\ \text{m}}{0.75\ \text{m}} = -2.0.$$






Significance
The minus sign for the magnification means that the image is inverted. The focal length is positive, as expected for a converging lens. Ray tracing can be used to check the calculation (see Figure 2.28). As expected, the image is inverted, is real, and is larger than the object.


Figure 2.28 A light bulb placed 0.75 m from a lens having a 0.50-m focal length produces a real image on a screen, as discussed in the example. Ray tracing predicts the image location and size.


2.5 | The Eye
Learning Objectives


By the end of this section, you will be able to:
• Understand the basic physics of how images are formed by the human eye
• Recognize several conditions of impaired vision as well as the optics principles for treating these conditions


The human eye is perhaps the most interesting and important of all optical instruments. Our eyes perform a vast number of functions: They allow us to sense direction, movement, colors, and distance. In this section, we explore the geometric optics of the eye.
Physics of the Eye
The eye is remarkable in how it forms images and in the richness of detail and color it can detect. However, our eyes often need some correction to reach what is called “normal” vision. Actually, normal vision should be called “ideal” vision because nearly one-half of the human population requires some sort of eyesight correction, so requiring glasses is by no means “abnormal.” Image formation by our eyes and common vision correction can be analyzed with the optics discussed earlier in this chapter.
Figure 2.29 shows the basic anatomy of the eye. The cornea and lens form a system that, to a good approximation, acts as a single thin lens. For clear vision, a real image must be projected onto the light-sensitive retina, which lies a fixed distance from the lens. The flexible lens of the eye allows it to adjust the radius of curvature of the lens to produce an image on the retina for objects at different distances. The center of the image falls on the fovea, which has the greatest density of light receptors and the greatest acuity (sharpness) in the visual field. The variable opening (i.e., the pupil) of the eye, along with chemical adaptation, allows the eye to detect light intensities from the lowest observable to 10^10 times greater (without damage). This is an incredible range of detection. Processing of visual nerve impulses begins with interconnections in the retina and continues in the brain. The optic nerve conveys the signals received by the eye to the brain.






Figure 2.29 The cornea and lens of the eye act together to form a real image on the light-sensing retina, which has its densest concentration of receptors in the fovea and a blind spot over the optic nerve. The radius of curvature of the lens of an eye is adjustable to form an image on the retina for different object distances. Layers of tissues with varying indices of refraction in the lens are shown here. However, they have been omitted from other pictures for clarity.


The indices of refraction in the eye are crucial to its ability to form images. Table 2.1 lists the indices of refraction relevant to the eye. The biggest change in the index of refraction, which is where the light rays are most bent, occurs at the air-cornea interface rather than at the aqueous humor-lens interface. The ray diagram in Figure 2.30 shows image formation by the cornea and lens of the eye. The cornea, which is itself a converging lens with a focal length of approximately 2.3 cm, provides most of the focusing power of the eye. The lens, which is a converging lens with a focal length of about 6.4 cm, provides the finer focus needed to produce a clear image on the retina. The cornea and lens can be treated as a single thin lens, even though the light rays pass through several layers of material (such as cornea, aqueous humor, several layers in the lens, and vitreous humor), changing direction at each interface. The image formed is much like the one produced by a single convex lens (i.e., a real, inverted image). Although images formed in the eye are inverted, the brain inverts them once more to make them seem upright.
Material          Index of Refraction
Water             1.33
Air               1.0
Cornea            1.38
Aqueous humor     1.34
Lens              1.41*
Vitreous humor    1.34


Table 2.1 Refractive Indices Relevant to the Eye. *This is an average value. The actual index of refraction varies throughout the lens and is greatest in the center of the lens.






Figure 2.30 In the human eye, an image forms on the retina. Rays from the top and bottom of the object are traced to show how a real, inverted image is produced on the retina. The distance to the object is not to scale.


As noted, the image must fall precisely on the retina to produce clear vision; that is, the image distance di must equal
the lens-to-retina distance. Because the lens-to-retina distance does not change, the image distance di must be the same
for objects at all distances. The ciliary muscles adjust the shape of the eye lens for focusing on nearby or far objects. By changing the shape of the eye lens, the eye changes the focal length of the lens. This mechanism of the eye is called accommodation.
The nearest point an object can be placed so that the eye can form a clear image on the retina is called the near point of the eye. Similarly, the far point is the farthest distance at which an object is clearly visible. A person with normal vision can see objects clearly at distances ranging from 25 cm to essentially infinity. The near point increases with age, becoming several meters for some older people. In this text, we consider the near point to be 25 cm.
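
Accommodation can be put in numbers with the thin-lens equation: with the image distance pinned to the lens-to-retina distance, the focal length must change with the object distance. The sketch below is illustrative only; the 2.0-cm lens-to-retina distance is an assumed round value, not a figure from this text, and the function name is hypothetical.

```python
import math


def required_focal_length(d_o, d_i):
    """Focal length that images an object at distance d_o onto a retina at d_i."""
    return 1.0 / (1.0 / d_o + 1.0 / d_i)


LENS_TO_RETINA_CM = 2.0  # assumed value, for illustration only
NEAR_POINT_CM = 25.0     # near point adopted in this text

# For a very distant object, 1/d_o -> 0 and f equals the lens-to-retina distance.
f_far = required_focal_length(math.inf, LENS_TO_RETINA_CM)
# For an object at the near point, the eye lens must shorten its focal length.
f_near = required_focal_length(NEAR_POINT_CM, LENS_TO_RETINA_CM)

print(f"distant object:   f = {f_far:.2f} cm")
print(f"near-point object: f = {f_near:.2f} cm")
```

Under this assumption the focal length changes by less than 10% between the two extremes, which is the small adjustment the ciliary muscles provide.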
We can use the thin-lens equations to quantitatively examine image formation by the eye. First, we define the optical power of a lens as

$$P = \frac{1}{f} \tag{2.23}$$


with the focal length f given in meters. The units of optical power are called “diopters” (D). That is, 1 D = 1/m, or 1 m^−1 .
Optometrists prescribe common eyeglasses and contact lenses in units of diopters. With this definition of optical power, we can rewrite the thin-lens equations as


$$P = \frac{1}{d_o} + \frac{1}{d_i}. \tag{2.24}$$


Working with optical power is convenient because, for two or more lenses close together, the effective optical power of the lens system is approximately the sum of the optical power of the individual lenses:

$$P_{\text{total}} = P_{\text{lens 1}} + P_{\text{lens 2}} + P_{\text{lens 3}} + \cdots \tag{2.25}$$






Example 2.6
Effective Focal Length of the Eye
The cornea and eye lens have focal lengths of 2.3 and 6.4 cm, respectively. Find the net focal length and optical power of the eye.
Strategy
The optical powers of the closely spaced lenses add, so Peye = Pcornea + Plens .
Solution
Writing the equation for power in terms of the focal lengths gives


$$\frac{1}{f_{\text{eye}}} = \frac{1}{f_{\text{cornea}}} + \frac{1}{f_{\text{lens}}} = \frac{1}{2.3\ \text{cm}} + \frac{1}{6.4\ \text{cm}}.$$


Hence, the focal length of the eye (cornea and lens together) is
feye = 1.69 cm.


The optical power of the eye is

$$P_{\text{eye}} = \frac{1}{f_{\text{eye}}} = \frac{1}{0.0169\ \text{m}} = 59\ \text{D}.$$
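
The arithmetic of this example can be checked by adding optical powers, as in Equation 2.25. This is a small sketch under the same closely-spaced-lens approximation; the function name is hypothetical.

```python
def combined_focal_length(*focal_lengths):
    """Net focal length of closely spaced thin lenses (optical powers add)."""
    total_power = sum(1.0 / f for f in focal_lengths)
    return 1.0 / total_power


# Focal lengths from Example 2.6, in centimeters.
f_eye_cm = combined_focal_length(2.3, 6.4)
# Optical power in diopters requires the focal length in meters.
p_eye_diopters = 1.0 / (f_eye_cm / 100.0)

print(f"f_eye = {f_eye_cm:.2f} cm")
print(f"P_eye = {p_eye_diopters:.0f} D")
```

Because the function accepts any number of focal lengths, the same call would handle a corrective lens placed close to the eye as a third argument.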


For clear vision, the image distance di must equal the lens-to-retina distance. Normal vision is possible for objects at
distances do = 25 cm to infinity. The following example shows how to calculate the image distance for an object placed
at the near point of the eye.
Example 2.7


Image of an object placed at the near point
The net focal length of a particular human eye is 1.7 cm. An object is placed at the near point of the eye. How far behind the lens is a focused image formed?
Strategy
The near point is 25 cm from the eye, so the object distance is do = 25 cm . We determine the image distance
from the lens equation:


$$\frac{1}{d_i} = \frac{1}{f} - \frac{1}{d_o}.$$


Solution

$$d_i = \left(\frac{1}{f} - \frac{1}{d_o}\right)^{-1} = \left(\frac{1}{1.7\ \text{cm}} - \frac{1}{25\ \text{cm}}\right)^{-1} = 1.8\ \text{cm}.$$


Therefore, the image is formed 1.8 cm behind the lens.
Significance
From the magnification formula, we find $m = -\frac{1.8\ \text{cm}}{25\ \text{cm}} = -0.073$ (using the unrounded image distance, di = 1.82 cm). Since m < 0 , the image is inverted in orientation with respect to the object. From the absolute value of m we see that the image is much smaller than the object; in fact, it is only 7% of the size of the object.






Vision Correction
The need for some type of vision correction is very common. Typical vision defects are easy to understand with geometric optics, and some are simple to correct. Figure 2.31 illustrates two common vision defects. Nearsightedness, or myopia, is the ability to see near objects, whereas distant objects are blurry. The eye overconverges the nearly parallel rays from a distant object, and the rays cross in front of the retina. More divergent rays from a close object are converged on the retina for a clear image. The distance to the farthest object that can be seen clearly is called the far point of the eye (normally the far point is at infinity). Farsightedness, or hyperopia, is the ability to see far objects clearly, whereas near objects are blurry. A farsighted eye does not sufficiently converge the rays from a near object to make the rays meet on the retina.


Figure 2.31 (a) The nearsighted (myopic) eye converges rays from a distant object in front of the retina, so they have diverged when they strike the retina, producing a blurry image. An eye lens that is too powerful can cause nearsightedness, or the eye may be too long. (b) The farsighted (hyperopic) eye is unable to converge the rays from a close object on the retina, producing blurry near-field vision. An eye lens with insufficient optical power or an eye that is too short can cause farsightedness.


Since the nearsighted eye overconverges light rays, the correction for nearsightedness consists of placing a diverging eyeglass lens in front of the eye, as shown in Figure 2.32. This reduces the optical power of an eye that is too powerful (recall that the focal length of a diverging lens is negative, so its optical power is negative). Another way to understand this correction is that a diverging lens will cause the incoming rays to diverge more to compensate for the excessive convergence caused by the lens system of the eye. The image produced by the diverging eyeglass lens serves as the (optical) object for the eye, and because the eye cannot focus on objects beyond its far point, the diverging lens must form an image of distant (physical) objects at a point that is closer than the far point.






Figure 2.32 Correction of nearsightedness requires a diverging lens that compensates for overconvergence by the eye. The diverging lens produces an image closer to the eye than the physical object. This image serves as the optical object for the eye, and the nearsighted person can see it clearly because it is closer than their far point.


Example 2.8
Correcting Nearsightedness
What optical power of eyeglass lens is needed to correct the vision of a nearsighted person whose far point is 30.0 cm? Assume the corrective lens is fixed 1.50 cm away from the eye.
Strategy
You want this nearsighted person to be able to see distant objects clearly, which means that the eyeglass lens must produce an image 30.0 cm from the eye for an object at infinity. An image 30.0 cm from the eye will be 30.0 cm − 1.50 cm = 28.5 cm from the eyeglass lens. Therefore, we must have di = −28.5 cm when
do = ∞ . The image distance is negative because it is on the same side of the eyeglass lens as the object.
Solution
Since di and do are known, we can find the optical power of the eyeglass lens by using Equation 2.24:


$$P = \frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{\infty} + \frac{1}{-0.285\ \text{m}} = -3.51\ \text{D}.$$


Significance
The negative optical power indicates a diverging (or concave) lens, as expected. If you examine eyeglasses for nearsighted people, you will find the lenses are thinnest in the center. Additionally, if you examine a prescription for eyeglasses for nearsighted people, you will find that the prescribed optical power is negative and given in units of diopters.


Correcting farsightedness consists simply of using the opposite type of lens as for nearsightedness (i.e., a converging lens), as shown in Figure 2.33.
Such a lens will produce an image of physical objects that are closer than the near point at a distance that is between the near point and the far point, so that the person can see the image clearly. To determine the optical power needed for correction, you must therefore know the person’s near point, as explained in Example 2.9.


Figure 2.33 Correction of farsightedness uses a converging lens that compensates for the underconvergence by the eye. The converging lens produces an image farther from the eye than the object, so that the farsighted person can see it clearly.


Example 2.9
Correcting Farsightedness
What optical power of eyeglass lens is needed to allow a farsighted person, whose near point is 1.00 m, to see an object clearly that is 25.0 cm from the eye? Assume the corrective lens is fixed 1.5 cm from the eye.
Strategy
When an object is 25.0 cm from the person’s eyes, the eyeglass lens must produce an image 1.00 m away (the near point), so that the person can see it clearly. An image 1.00 m from the eye will be 100 cm − 1.5 cm = 98.5 cm
from the eyeglass lens because the eyeglass lens is 1.5 cm from the eye. Therefore, di = −98.5 cm , where
the minus sign indicates that the image is on the same side of the lens as the object. The object is
25.0 cm − 1.5 cm = 23.5 cm from the eyeglass lens, so do = 23.5 cm .
Solution
Since di and do are known, we can find the optical power of the eyeglass lens by using Equation 2.24:


$$P = \frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{0.235\ \text{m}} + \frac{1}{-0.985\ \text{m}} = +3.24\ \text{D}.$$


Significance
The positive optical power indicates a converging (convex) lens, as expected. If you examine eyeglasses of farsighted people, you will find the lenses to be thickest in the center. In addition, prescription eyeglasses for farsighted people have a prescribed optical power that is positive.
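
Examples 2.8 and 2.9 share one pattern: measure both distances from the eyeglass lens, treat the image as virtual, and apply Equation 2.24. The sketch below packages that pattern; the function and parameter names are hypothetical, and it assumes the image is always on the object side of the lens, as in both examples.

```python
import math


def corrective_lens_power(object_from_eye_m, image_from_eye_m, lens_to_eye_m):
    """Optical power (in diopters) of an eyeglass lens, from P = 1/d_o + 1/d_i.

    object_from_eye_m: distance of the physical object from the eye.
    image_from_eye_m:  where the lens must place its virtual image
                       (the far point for myopia, the near point for hyperopia).
    lens_to_eye_m:     separation between the eyeglass lens and the eye.
    """
    d_o = object_from_eye_m - lens_to_eye_m
    d_i = -(image_from_eye_m - lens_to_eye_m)  # virtual image, so negative
    return 1.0 / d_o + 1.0 / d_i


# Example 2.8: far point 30.0 cm, object at infinity, lens 1.50 cm from the eye.
p_myopia = corrective_lens_power(math.inf, 0.300, 0.0150)
# Example 2.9: near point 1.00 m, object at 25.0 cm, lens 1.5 cm from the eye.
p_hyperopia = corrective_lens_power(0.250, 1.00, 0.015)

print(f"nearsighted correction: {p_myopia:+.2f} D")
print(f"farsighted correction:  {p_hyperopia:+.2f} D")
```

The sign of the result identifies the lens type: negative for the diverging lens that corrects nearsightedness and positive for the converging lens that corrects farsightedness.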


2.6 | The Camera
Learning Objectives


By the end of this section, you will be able to:
• Describe the optics of a camera
• Characterize the image created by a camera


Cameras are very common in our everyday life. Between 1825 and 1827, French inventor Nicéphore Niépce successfully photographed an image created by a primitive camera. Since then, enormous progress has been achieved in the design of cameras and camera-based detectors.
Initially, photographs were recorded by using the light-sensitive reaction of silver-based compounds such as silver chloride or silver bromide. Silver-based photographic paper was in common use until the advent of digital photography in the 1980s, which is intimately connected to charge-coupled device (CCD) detectors. In a nutshell, a CCD is a semiconductor chip that records images as a matrix of tiny pixels, each pixel located in a “bin” in the surface. Each pixel is capable of detecting the intensity of light impinging on it. Color is brought into play by putting red-, blue-, and green-colored filters over the pixels, resulting in colored digital images (Figure 2.34). At its best resolution, one CCD pixel corresponds to one pixel of the image. To reduce the resolution and decrease the size of the file, we can “bin” several CCD pixels into one, resulting in a smaller but “pixelated” image.


Figure 2.34 A charge-coupled device (CCD) converts light signals into electronic signals, enabling electronic processing and storage of visual images. This is the basis for electronic imaging in all digital cameras, from cell phones to movie cameras. (credit left: modification of work by Bruce Turner)


Clearly, electronics is a big part of a digital camera; however, the underlying physics is basic optics. As a matter of fact, the optics of a camera are pretty much the same as those of a single lens with an object distance that is significantly larger than the lens’s focal distance (Figure 2.35).






Figure 2.35 Modern digital cameras have several lenses to produce a clear image with minimal aberration and use red, blue, and green filters to produce a color image.


For instance, let us consider the camera in a smartphone. An average smartphone camera is equipped with a stationary wide-angle lens with a focal length of about 4–5 mm. (This focal length is about equal to the thickness of the phone.) The image created by the lens is focused on the CCD detector mounted at the opposite side of the phone. In a cell phone, the lens and the CCD cannot move relative to each other. So how do we make sure that both the images of a distant and a close object are in focus?
Recall that a human eye can accommodate for distant and close images by changing its focal distance. A cell phone camera cannot do that because the distance from the lens to the detector is fixed. Here is where the small focal distance becomes important. Let us assume we have a camera with a 5-mm focal distance. What is the image distance for a selfie? The object distance for a selfie (the length of the hand holding the phone) is about 50 cm. Using the thin-lens equation, we can write


$$\frac{1}{5\ \text{mm}} = \frac{1}{500\ \text{mm}} + \frac{1}{d_i}.$$


We then obtain the image distance:

$$\frac{1}{d_i} = \frac{1}{5\ \text{mm}} - \frac{1}{500\ \text{mm}}.$$


Note that the object distance is 100 times larger than the focal distance. We can clearly see that the 1/(500 mm) term is significantly smaller than 1/(5 mm), which means that the image distance is pretty much equal to the lens’s focal length. An actual calculation gives us the image distance di = 5.05 mm . This value is extremely close to the lens’s focal distance.
Now let us consider the case of a distant object. Let us say that we would like to take a picture of a person standing about 5 m from us. Using the thin-lens equation again, we obtain the image distance of 5.005 mm. The farther the object is from the lens, the closer the image distance is to the focal distance. At the limiting case of an infinitely distant object, we obtain the image distance exactly equal to the focal distance of the lens.
As you can see, the difference between the image distance for a selfie and the image distance for a distant object is just about 0.05 mm or 50 microns. Even a short object distance such as the length of your hand is two orders of magnitude larger than the lens’s focal length, resulting in minute variations of the image distance. (The 50-micron difference is smaller than the thickness of an average sheet of paper.) Such a small difference can be easily accommodated by the same detector, positioned at the focal distance of the lens. Image analysis software can help improve image quality.
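
The numbers quoted for the smartphone camera are easy to reproduce. The following is a minimal sketch using the thin-lens equation with the 5-mm focal length and the two object distances from the text; the function and variable names are hypothetical.

```python
def image_distance_mm(f_mm, d_o_mm):
    """Thin-lens image distance in millimeters for a real, distant object."""
    return 1.0 / (1.0 / f_mm - 1.0 / d_o_mm)


FOCAL_LENGTH_MM = 5.0

selfie_mm = image_distance_mm(FOCAL_LENGTH_MM, 500.0)    # object at 50 cm
distant_mm = image_distance_mm(FOCAL_LENGTH_MM, 5000.0)  # object at 5 m
shift_um = (selfie_mm - distant_mm) * 1000.0             # mm -> microns

print(f"selfie:  d_i = {selfie_mm:.3f} mm")
print(f"distant: d_i = {distant_mm:.3f} mm")
print(f"focus shift = {shift_um:.1f} microns")
```

The computed shift is about 45 microns, consistent with the rounded figure of roughly 50 microns given above.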
Conventional point-and-shoot cameras often use a movable lens to change the lens-to-image distance. Complex lenses of the more expensive mirror reflex cameras allow for superb quality photographic images. The optics of these camera lenses is beyond the scope of this textbook.
2.7 | The Simple Magnifier


Learning Objectives
By the end of this section, you will be able to:
• Understand the optics of a simple magnifier
• Characterize the image created by a simple magnifier


The apparent size of an object perceived by the eye depends on the angle the object subtends from the eye. As shown in Figure 2.36, the object at A subtends a larger angle from the eye than when it is positioned at point B. Thus, the object at A forms a larger image on the retina (see OA′ ) than when it is positioned at B (see OB′ ). Thus, objects that subtend large
angles from the eye appear larger because they form larger images on the retina.


Figure 2.36 Size perceived by an eye is determined by the angle subtended by the object. An image formed on the retina by an object at A is larger than an image formed on the retina by the same object positioned at B (compare image heights OA′ to OB′ ).


We have seen that, when an object is placed within a focal length of a convex lens, its image is virtual, upright, and larger than the object (see part (b) of Figure 2.26). Thus, when such an image produced by a convex lens serves as the object for the eye, as shown in Figure 2.37, the image on the retina is enlarged, because the image produced by the lens subtends a larger angle in the eye than does the object. A convex lens used for this purpose is called a magnifying glass or a simple magnifier.






Figure 2.37 The simple magnifier is a convex lens used to produce an enlarged image of an object on the retina. (a) With no convex lens, the object subtends an angle θobject from the eye. (b) With the convex lens in place, the image produced by the
convex lens subtends an angle θimage from the eye, with θimage > θobject . Thus, the image on the retina is larger with the
convex lens in place.


To account for the magnification of a magnifying lens, we compare the angle subtended by the image (created by the lens) with the angle subtended by the object (viewed with no lens), as shown in Figure 2.37. We assume that the object is situated at the near point of the eye, because this is the object distance at which the unaided eye can form the largest image on the retina. We will compare the magnified images created by a lens with this maximum image size for the unaided eye. The magnification of an image when observed by the eye is the angular magnification M, which is defined by the ratio of the angle θimage subtended by the image to the angle θobject subtended by the object:


$$M = \frac{\theta_{\text{image}}}{\theta_{\text{object}}}. \tag{2.26}$$


Consider the situation shown in Figure 2.37. The magnifying lens is held a distance ℓ from the eye, and the image produced by the magnifier forms at a distance L from the eye. We want to calculate the angular magnification for arbitrary L and ℓ. In the small-angle approximation, the angular size $\theta_{\text{image}}$ of the image is $h_i/L$. The angular size $\theta_{\text{object}}$ of the object at the near point is $\theta_{\text{object}} = h_o/25\ \text{cm}$. The angular magnification is then




This OpenStax book is available for free at http://cnx.org/content/col12067/1.4




$$M = \frac{\theta_{\text{image}}}{\theta_{\text{object}}} = \frac{h_i\,(25\ \text{cm})}{L\,h_o}. \tag{2.27}$$


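Using Equation 2.8 for linear magnification

$$m = -\frac{d_i}{d_o} = \frac{h_i}{h_o}$$

and the thin-lens equation

$$\frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f}$$

in Equation 2.27, we arrive at the following expression for the angular magnification of a magnifying lens:

$$M = \left(-\frac{d_i}{d_o}\right)\left(\frac{25\ \text{cm}}{L}\right) = -d_i\left(\frac{1}{f} - \frac{1}{d_i}\right)\left(\frac{25\ \text{cm}}{L}\right) = \left(1 - \frac{d_i}{f}\right)\left(\frac{25\ \text{cm}}{L}\right). \tag{2.28}$$
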
From part (b) of Figure 2.37, we see that the absolute value of the image distance is $|d_i| = L - \ell$. Note that $d_i < 0$ because the image is virtual, so we can dispense with the absolute value by explicitly inserting the minus sign: $-d_i = L - \ell$. Inserting this into Equation 2.28 gives us the final equation for the angular magnification of a magnifying lens:


$$M = \left(\frac{25\ \text{cm}}{L}\right)\left(1 + \frac{L - \ell}{f}\right). \tag{2.29}$$


Note that all the quantities in this equation have to be expressed in centimeters. Often, we want the image to be at the near-point distance ( L = 25 cm ) to get maximum magnification, and we hold the magnifying lens close to the eye ( ℓ = 0 ). In
this case, Equation 2.29 gives


$$M = 1 + \frac{25\ \text{cm}}{f}, \tag{2.30}$$


which shows that the greatest magnification occurs for the lens with the shortest focal length. In addition, when the image is at the near-point distance and the lens is held close to the eye (ℓ = 0), then $L = |d_i| = 25\ \text{cm}$ and Equation 2.27 becomes

$$M = \frac{h_i}{h_o} = m, \tag{2.31}$$


where m is the linear magnification (Equation 2.8) derived for spherical mirrors and thin lenses. Another useful situation is when the image is at infinity (L = ∞). Equation 2.29 then takes the form

$$M(L = \infty) = \frac{25\ \text{cm}}{f}. \tag{2.32}$$


The resulting magnification is simply the ratio of the near-point distance to the focal length of the magnifying lens, so a lens with a shorter focal length gives a stronger magnification. Although this magnification is smaller by 1 than the magnification obtained with the image at the near point, it provides for the most comfortable viewing conditions, because the eye is relaxed when viewing a distant object.
By comparing Equation 2.30 with Equation 2.32, we see that the range of angular magnification of a given converging lens is






$$\frac{25\ \text{cm}}{f} \le M \le 1 + \frac{25\ \text{cm}}{f}. \tag{2.33}$$
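The relations above can be checked numerically. The following Python sketch evaluates Equation 2.29 for a general image distance L and lens-to-eye distance ℓ, and compares it with the two limiting cases (image at the near point with ℓ = 0, and image at infinity). The function names and the 5.0 cm focal length are illustrative assumptions, not values from the text.

```python
NEAR_POINT_CM = 25.0  # near-point distance assumed in the text


def angular_magnification(f_cm, L_cm, ell_cm=0.0, near_point_cm=NEAR_POINT_CM):
    """Equation 2.29: M = (25 cm / L) * (1 + (L - ell) / f), lengths in cm."""
    return (near_point_cm / L_cm) * (1.0 + (L_cm - ell_cm) / f_cm)


def magnification_image_at_infinity(f_cm, near_point_cm=NEAR_POINT_CM):
    """Equation 2.32: the L -> infinity limit of Equation 2.29."""
    return near_point_cm / f_cm


f = 5.0  # hypothetical focal length in cm

# Image at the near point, lens held at the eye: upper bound of Equation 2.33.
m_near = angular_magnification(f, L_cm=25.0, ell_cm=0.0)

# Image at infinity: lower bound of Equation 2.33.
m_far = magnification_image_at_infinity(f)

# A very distant (but finite) image should approach the infinity limit.
m_large_L = angular_magnification(f, L_cm=1.0e6)

print(m_near, m_far, m_large_L)
```

For this assumed lens, the two bounds differ by exactly 1, as the text states.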


Example 2.10
Magnifying a Diamond
A jeweler wishes to inspect a 3.0-mm-diameter diamond with a magnifier. The diamond is held at the jeweler’s near point (25 cm), and the jeweler holds the magnifying lens close to his eye.
(a) What should the focal length of the magnifying lens be to see a 15-mm-diameter image of the diamond?
(b) What should the focal length of the magnifying lens be to obtain 10 × magnification?
Strategy
We need to determine the requisite magnification of the magnifier. Because the jeweler holds the magnifying lens close to his eye, we can use Equation 2.30 to find the focal length of the magnifying lens.
Solution
a. The required linear magnification is the ratio of the desired image diameter to the diamond’s actual diameter (Equation 2.8). Because the jeweler holds the magnifying lens close to his eye and the image forms at his near point, the linear magnification is the same as the angular magnification, so

$$M = m = \frac{h_i}{h_o} = \frac{15\ \text{mm}}{3.0\ \text{mm}} = 5.0.$$


The focal length f of the magnifying lens may be calculated by solving Equation 2.30 for f, which gives

$$M = 1 + \frac{25\ \text{cm}}{f}$$

$$f = \frac{25\ \text{cm}}{M - 1} = \frac{25\ \text{cm}}{5.0 - 1} = 6.3\ \text{cm}$$


b. To get an image magnified by a factor of ten, we again solve Equation 2.30 for f, but this time we use M = 10. The result is

$$f = \frac{25\ \text{cm}}{M - 1} = \frac{25\ \text{cm}}{10 - 1} = 2.8\ \text{cm}.$$


Significance
Note that a greater magnification is achieved by using a lens with a smaller focal length. We thus need to use a lens with radii of curvature that are less than a few centimeters and hold it very close to our eye. This is not very convenient. A compound microscope, explored in the following section, can overcome this drawback.
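The arithmetic of this example can be reproduced with a short Python sketch. It solves M = 1 + 25 cm/f for f, which assumes the image is at the near point and the lens is held at the eye; the function name is a hypothetical choice made for illustration.

```python
def focal_length_for_magnification(M, near_point_cm=25.0):
    """Solve M = 1 + near_point / f for f (in cm).

    Valid only for the case treated in the example: image at the near
    point and the lens held close to the eye.
    """
    if M <= 1.0:
        raise ValueError("this relation requires M > 1")
    return near_point_cm / (M - 1.0)


# Part (a): a 15 mm image of a 3.0 mm diamond means M = m = 5.0.
f_a = focal_length_for_magnification(15.0 / 3.0)

# Part (b): tenfold magnification.
f_b = focal_length_for_magnification(10.0)

print(f"(a) f = {f_a:.2f} cm, (b) f = {f_b:.2f} cm")
```

Rounded to two significant figures, these values match the 6.3 cm and 2.8 cm quoted in the solution.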


2.8 | Microscopes and Telescopes
Learning Objectives


By the end of this section, you will be able to:
• Explain the physics behind the operation of microscopes and telescopes
• Describe the image created by these instruments and calculate their magnifications


Microscopes and telescopes are major instruments that have contributed hugely to our current understanding of the micro- and macroscopic worlds. The invention of these devices led to numerous discoveries in disciplines such as physics, astronomy, and biology, to name a few. In this section, we explain the basic physics that make these instruments work.








Microscopes
Although the eye is marvelous in its ability to see objects large and small, it obviously is limited in the smallest details it can detect. The desire to see beyond what is possible with the naked eye led to the use of optical instruments. We have seen that a simple convex lens can create a magnified image, but it is hard to get large magnification with such a lens. A magnification greater than 5× is difficult without distorting the image. To get higher magnification, we can combine the simple magnifying glass with one or more additional lenses. In this section, we examine microscopes that enlarge the details that we cannot see with the naked eye.
Microscopes were first developed in the early 1600s by eyeglass makers in The Netherlands and Denmark. The simplest compound microscope is constructed from two convex lenses (Figure 2.38). The objective lens is a convex lens of short focal length (i.e., high power) with typical magnification from 5× to 100×. The eyepiece, also referred to as the ocular, is a convex lens of longer focal length.
The purpose of a microscope is to create magnified images of small objects, and both lenses contribute to the final magnification. Also, the final enlarged image is produced sufficiently far from the observer to be easily viewed, since the eye cannot focus on objects or images that are too close (i.e., closer than the near point of the eye).


Figure 2.38 A compound microscope is composed of two lenses: an objective and an eyepiece. The objective forms the first image, which is larger than the object. This first image is inside the focal length of the eyepiece and serves as the object for the eyepiece. The eyepiece forms a final image that is further magnified.


To see how the microscope in Figure 2.38 forms an image, consider its two lenses in succession. The object is just beyond the focal length $f^{\text{obj}}$ of the objective lens, producing a real, inverted image that is larger than the object. This first image serves as the object for the second lens, or eyepiece. The eyepiece is positioned so that the first image is within its focal length $f^{\text{eye}}$, so that it can further magnify the image. In a sense, it acts as a magnifying glass that magnifies the intermediate image produced by the objective. The image produced by the eyepiece is a magnified virtual image. The final image remains inverted but is farther from the observer than the object, making it easy to view.
The eye views the virtual image created by the eyepiece, which serves as the object for the lens in the eye. The virtual image formed by the eyepiece is well outside the focal length of the eye, so the eye forms a real image on the retina.
The magnification of the microscope is the product of the linear magnification $m^{\text{obj}}$ by the objective and the angular magnification $M^{\text{eye}}$ by the eyepiece. These are given by






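$$m^{\text{obj}} = -\frac{d_i^{\text{obj}}}{d_o^{\text{obj}}} \approx -\frac{d_i^{\text{obj}}}{f^{\text{obj}}} \quad \text{(linear magnification by objective)}$$

$$M^{\text{eye}} = 1 + \frac{25\ \text{cm}}{f^{\text{eye}}} \quad \text{(angular magnification by eyepiece)}$$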


Here, $f^{\text{obj}}$ and $f^{\text{eye}}$ are the focal lengths of the objective and the eyepiece, respectively. We assume that the final image is formed at the near point of the eye, providing the largest magnification. Note that the angular magnification of the eyepiece is the same as obtained earlier for the simple magnifying glass. This should not be surprising, because the eyepiece is essentially a magnifying glass, and the same physics applies here. The net magnification $M_{\text{net}}$ of the compound microscope is the product of the linear magnification of the objective and the angular magnification of the eyepiece:


$$M_{\text{net}} = m^{\text{obj}} M^{\text{eye}} = -\frac{d_i^{\text{obj}}\left(f^{\text{eye}} + 25\ \text{cm}\right)}{f^{\text{obj}} f^{\text{eye}}}. \tag{2.34}$$


Example 2.11
Microscope Magnification
Calculate the magnification of an object placed 6.20 mm from a compound microscope that has a 6.00-mm-focal-length objective and a 50.0-mm-focal-length eyepiece. The objective and eyepiece are separated by 23.0 cm.
Strategy
This situation is similar to that shown in Figure 2.38. To find the overall magnification, we must know the linear magnification of the objective and the angular magnification of the eyepiece. We can use Equation 2.34, but we need to use the thin-lens equation to find the image distance $d_i^{\text{obj}}$ of the objective.
Solution
Solving the thin-lens equation for $d_i^{\text{obj}}$ gives


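$$d_i^{\text{obj}} = \left(\frac{1}{f^{\text{obj}}} - \frac{1}{d_o^{\text{obj}}}\right)^{-1} = \left(\frac{1}{6.00\ \text{mm}} - \frac{1}{6.20\ \text{mm}}\right)^{-1} = 186\ \text{mm} = 18.6\ \text{cm}$$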


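Inserting this result into Equation 2.34 along with the known values $f^{\text{obj}} = 6.00\ \text{mm} = 0.600\ \text{cm}$ and $f^{\text{eye}} = 50.0\ \text{mm} = 5.00\ \text{cm}$ gives

$$M_{\text{net}} = -\frac{d_i^{\text{obj}}\left(f^{\text{eye}} + 25\ \text{cm}\right)}{f^{\text{obj}} f^{\text{eye}}} = -\frac{(18.6\ \text{cm})(5.00\ \text{cm} + 25\ \text{cm})}{(0.600\ \text{cm})(5.00\ \text{cm})} = -186$$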


Significance
Both the objective and the eyepiece contribute to the overall magnification, which is large and negative, consistent with Figure 2.38, where the image is seen to be large and inverted. In this case, the image is virtual and inverted, which cannot happen for a single element (see Figure 2.26).
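The two steps of this example (thin-lens equation for the objective, then Equation 2.34) can be sketched in Python as follows. The function names are assumptions made for illustration, and the inputs are the focal lengths and object distance given in the problem statement (6.00 mm, 50.0 mm, and 6.20 mm), converted to centimeters.

```python
def image_distance(f_cm, d_o_cm):
    """Thin-lens equation solved for the image distance: 1/d_i = 1/f - 1/d_o."""
    return 1.0 / (1.0 / f_cm - 1.0 / d_o_cm)


def microscope_net_magnification(d_i_obj_cm, f_obj_cm, f_eye_cm,
                                 near_point_cm=25.0):
    """Equation 2.34, with the final image at the near point of the eye."""
    return -d_i_obj_cm * (f_eye_cm + near_point_cm) / (f_obj_cm * f_eye_cm)


f_obj = 0.600  # objective focal length, cm (6.00 mm)
f_eye = 5.00   # eyepiece focal length, cm (50.0 mm)
d_o = 0.620    # object distance from the objective, cm (6.20 mm)

d_i = image_distance(f_obj, d_o)
M_net = microscope_net_magnification(d_i, f_obj, f_eye)

print(f"d_i = {d_i:.1f} cm, M_net = {M_net:.0f}")
```

With these inputs the sketch gives an image distance of about 18.6 cm and a net magnification of about −186. The intermediate image then lies roughly 23.0 cm − 18.6 cm = 4.4 cm from the eyepiece, which is inside the eyepiece focal length, as the setup requires.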








Figure 2.39 A compound microscope with the image created at infinity.


We now calculate the magnifying power of a microscope when the image is at infinity, as shown in Figure 2.39, because this makes for the most relaxed viewing. The magnifying power of the microscope is the product of the linear magnification $m^{\text{obj}}$ of the objective and the angular magnification $M^{\text{eye}}$ of the eyepiece. We know that $m^{\text{obj}} = -d_i^{\text{obj}}/d_o^{\text{obj}}$, and from the thin-lens equation we obtain


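$$m^{\text{obj}} = -\frac{d_i^{\text{obj}}}{d_o^{\text{obj}}} = 1 - \frac{d_i^{\text{obj}}}{f^{\text{obj}}} = \frac{f^{\text{obj}} - d_i^{\text{obj}}}{f^{\text{obj}}}. \tag{2.35}$$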


If the final image is at infinity, then the image created by the objective must be located at the focal point of the eyepiece. This may be seen by considering the thin-lens equation with $d_i = \infty$ or by recalling that rays that pass through the focal point exit the lens parallel to each other, which is equivalent to focusing at infinity. For many microscopes, the distance between the image-side focal point of the objective and the object-side focal point of the eyepiece is standardized at L = 16 cm. This distance is called the tube length of the microscope. From Figure 2.39, we see that $L = d_i^{\text{obj}} - f^{\text{obj}}$. Inserting this into Equation 2.35 gives

$$m^{\text{obj}} = -\frac{L}{f^{\text{obj}}} = -\frac{16\ \text{cm}}{f^{\text{obj}}}. \tag{2.36}$$


We now need to calculate the angular magnification of the eyepiece with the image at infinity. To do so, we take the ratio of the angle $\theta_{\text{image}}$ subtended by the image to the angle $\theta_{\text{object}}$ subtended by the object at the near point of the eye (this is the closest that the unaided eye can view the object, and thus this is the position where the object will form the largest image on the retina of the unaided eye). Using Figure 2.39 and working in the small-angle approximation, we have $\theta_{\text{image}} \approx h_i^{\text{obj}}/f^{\text{eye}}$ and $\theta_{\text{object}} \approx h_i^{\text{obj}}/25\ \text{cm}$, where $h_i^{\text{obj}}$ is the height of the image formed by the objective, which is the object of the eyepiece. Thus, the angular magnification of the eyepiece is


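$$M^{\text{eye}} = \frac{\theta_{\text{image}}}{\theta_{\text{object}}} = \frac{h_i^{\text{obj}}}{f^{\text{eye}}} \cdot \frac{25\ \text{cm}}{h_i^{\text{obj}}} = \frac{25\ \text{cm}}{f^{\text{eye}}}. \tag{2.37}$$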


The net magnifying power of the compound microscope with the image at infinity is therefore






$$M_{\text{net}} = m^{\text{obj}} M^{\text{eye}} = -\frac{(16\ \text{cm})(25\ \text{cm})}{f^{\text{obj}} f^{\text{eye}}}. \tag{2.38}$$


The focal distances must be in centimeters. The minus sign indicates that the final image is inverted. Note that the only variables in the equation are the focal distances of the eyepiece and the objective, which makes this equation particularly useful.
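Because Equation 2.38 depends only on the two focal lengths, it is easy to evaluate directly. The Python sketch below does so under the text's assumptions of a 16 cm tube length and a 25 cm near point; the function name and the sample focal lengths (0.600 cm and 5.00 cm) are illustrative choices, not values prescribed by this passage.

```python
TUBE_LENGTH_CM = 16.0  # standardized tube length quoted in the text
NEAR_POINT_CM = 25.0   # near-point distance assumed in the text


def microscope_magnification_at_infinity(f_obj_cm, f_eye_cm,
                                         tube_length_cm=TUBE_LENGTH_CM,
                                         near_point_cm=NEAR_POINT_CM):
    """Equation 2.38: net magnification with the final image at infinity.

    Focal lengths must be given in centimeters.
    """
    return -(tube_length_cm * near_point_cm) / (f_obj_cm * f_eye_cm)


# Hypothetical lens pair: 0.600 cm objective and 5.00 cm eyepiece.
M_inf = microscope_magnification_at_infinity(0.600, 5.00)
print(f"M_net (image at infinity) = {M_inf:.0f}")
```

The negative sign of the result reflects the inverted final image.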
Telescopes
Telescopes are meant for viewing distant objects and produce an image that is larger than the image produced in the unaided eye. Telescopes gather far more light than the eye, allowing dim objects to be observed with greater magnification and better resolution. Telescopes were invented around 1600, and Galileo was the first to use them to study the heavens, with monumental consequences. He observed the moons of Jupiter, the craters and mountains on the moon, the details of sunspots, and the fact that the Milky Way is composed of a vast number of individual stars.


Figure 2.40 (a) Galileo made telescopes with a convex objective and a concave eyepiece. These produce an upright image and are used in spyglasses. (b) Most simple refracting telescopes have two convex lenses. The objective forms a real, inverted image at (or just within) the focal plane of the eyepiece. This image serves as the object for the eyepiece. The eyepiece forms a virtual, inverted image that is magnified.


Part (a) of Figure 2.40 shows a refracting telescope made of two lenses. The first lens, called the objective, forms a real








image within the focal length of the second lens, which is called the eyepiece. The image of the objective lens serves as the object for the eyepiece, which forms a magnified virtual image that is observed by the eye. This design is what Galileo used to observe the heavens.
Although the arrangement of the lenses in a refracting telescope looks similar to that in a microscope, there are important differences. In a telescope, the real object is far away and the intermediate image is smaller than the object. In a microscope, the real object is very close and the intermediate image is larger than the object. In both the telescope and the microscope, the eyepiece magnifies the intermediate image; in the telescope, however, this is the only magnification.
The most common two-lens telescope is shown in part (b) of the figure. The object is so far from the telescope that it is essentially at infinity compared with the focal lengths of the lenses ($d_o^{\text{obj}} \approx \infty$), so the incoming rays are essentially parallel and focus on the focal plane. Thus, the first image is produced at $d_i^{\text{obj}} = f^{\text{obj}}$, as shown in the figure, and is not large compared with what you might see by looking directly at the object. However, the telescope eyepiece (like the microscope eyepiece) allows you to get nearer than your near point to this first image and so magnifies it (because you are near to it, it subtends a larger angle from your eye and so forms a larger image on your retina). As for a simple magnifier, the angular magnification of a telescope is the ratio of the angle subtended by the image [$\theta_{\text{image}}$ in part (b)] to the angle subtended by the real object [$\theta_{\text{object}}$ in part (b)]:


$$M = \frac{\theta_{\text{image}}}{\theta_{\text{object}}}. \tag{2.39}$$


To obtain an expression for the magnification that involves only the lens parameters, note that the focal plane of the objective lens lies very close to the focal plane of the eyepiece. If we assume that these planes are superposed, we have the situation shown in Figure 2.41.


Figure 2.41 The focal plane of the objective lens of a telescope is very near to the focal plane of the eyepiece. The angle $\theta_{\text{image}}$ subtended by the image viewed through the eyepiece is larger than the angle $\theta_{\text{object}}$ subtended by the object when viewed with the unaided eye.


We further assume that the angles $\theta_{\text{object}}$ and $\theta_{\text{image}}$ are small, so that the small-angle approximation holds ($\tan\theta \approx \theta$). If the image formed at the focal plane has height h, then


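$$\theta_{\text{object}} \approx \tan\theta_{\text{object}} = \frac{h}{f^{\text{obj}}}$$

$$\theta_{\text{image}} \approx \tan\theta_{\text{image}} = \frac{-h}{f^{\text{eye}}}$$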





where the minus sign is introduced because the height is negative if we measure both angles in the counterclockwise direction. Inserting these expressions into Equation 2.39 gives

$$M = \frac{-h}{f^{\text{eye}}} \cdot \frac{f^{\text{obj}}}{h} = -\frac{f^{\text{obj}}}{f^{\text{eye}}}. \tag{2.40}$$


Thus, to obtain the greatest angular magnification, it is best to have an objective with a long focal length and an eyepiece with a short focal length. The greater the angular magnification M, the larger an object will appear when viewed through a telescope, making more details visible. Limits to observable details are imposed by many factors, including lens quality and atmospheric disturbance. Typical eyepieces have focal lengths of 2.5 cm or 1.25 cm. If the objective of the telescope has a focal length of 1 meter, then these eyepieces result in magnifications of 40× and 80×, respectively. Thus, the angular magnifications make the image appear 40 times or 80 times closer than the real object.
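The eyepiece figures quoted above follow directly from M = −f_obj/f_eye. A minimal Python sketch, with a hypothetical function name and both focal lengths expressed in centimeters, reproduces them:

```python
def telescope_angular_magnification(f_obj_cm, f_eye_cm):
    """Angular magnification of a two-lens refracting telescope,
    M = -f_obj / f_eye (the minus sign marks an inverted image)."""
    return -f_obj_cm / f_eye_cm


f_obj = 100.0  # objective focal length of 1 meter, in cm

# The two typical eyepiece focal lengths mentioned in the text.
magnifications = [telescope_angular_magnification(f_obj, f_eye)
                  for f_eye in (2.5, 1.25)]

print(magnifications)
```

The magnitudes are 40 and 80, matching the 40× and 80× in the text; the negative signs indicate that the image is inverted.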
The minus sign in the magnification indicates the image is inverted, which is unimportant for observing the stars but is a real problem for other applications, such as telescopes on ships or telescopic gun sights. If an upright image is needed, Galileo’s arrangement in part (a) of Figure 2.40 can be used. But a more common arrangement is to use a third convex lens as an eyepiece, increasing the distance between the first two and inverting the image once again, as seen in Figure 2.42.


Figure 2.42 This arrangement of three lenses in a telescope produces an upright final image. The first two lenses are far enough apart that the second lens inverts the image of the first. The third lens acts as a magnifier and keeps the image upright and in a location that is easy to view.


The largest refracting telescope in the world is the 40-inch diameter Yerkes telescope located at Lake Geneva, Wisconsin (Figure 2.43), and operated by the University of Chicago.
It is very difficult and expensive to build large refracting telescopes. You need large defect-free lenses, which in itself is a technically demanding task. A refracting telescope basically looks like a tube with a support structure to rotate it in different directions. A refracting telescope suffers from several problems. The aberration of lenses causes the image to be blurred. Also, as the lenses become thicker for larger lenses, more light is absorbed, making faint stars more difficult to observe. Large lenses are also very heavy and deform under their own weight. Some of these problems with refracting telescopes are addressed by avoiding refraction for collecting light and instead using a curved mirror in its place, as devised by Isaac Newton. These telescopes are called reflecting telescopes.








Figure 2.43 In 1897, the Yerkes Observatory in Wisconsin (USA) built a large refracting telescope with an objective lens that is 40 inches in diameter and has a tube length of 62 feet. (credit: Yerkes Observatory, University of Chicago)
Reflecting Telescopes
Isaac Newton designed the first reflecting telescope around 1670 to solve the problem of chromatic aberration that happens in all refracting telescopes. In chromatic aberration, light of different colors refracts by slightly different amounts in the lens. As a result, a rainbow appears around the image and the image appears blurred. In the reflecting telescope, light rays from a distant source fall upon the surface of a concave mirror fixed at the bottom end of the tube. The use of a mirror instead of a lens eliminates chromatic aberration. The concave mirror focuses the rays on its focal plane. The design problem is how to observe the focused image. Newton used a design in which the focused light from the concave mirror was reflected to one side of the tube into an eyepiece [part (a) of Figure 2.44]. This arrangement is common in many amateur telescopes and is called the Newtonian design.
Some telescopes reflect the light back toward the middle of the concave mirror using a convex mirror. In this arrangement, the light-gathering concave mirror has a hole in the middle [part (b) of the figure]. The light then is incident on an eyepiece lens. This arrangement of the objective and eyepiece is called the Cassegrain design. Most big telescopes, including the Hubble space telescope, are of this design. Other arrangements are also possible. In some telescopes, a light detector is placed right at the spot where light is focused by the curved mirror.


Figure 2.44 Reflecting telescopes: (a) In the Newtonian design, the eyepiece is located at the side of the telescope; (b) in the Cassegrain design, the eyepiece is located past a hole in the primary mirror.


Most astronomical research telescopes are now of the reflecting type. One of the earliest large telescopes of this kind is the Hale 200-inch (or 5-meter) telescope built on Mount Palomar in southern California, which has a 200-inch-diameter mirror. One of the largest telescopes in the world is the 10-meter Keck telescope at the Keck Observatory on the summit of






the dormant Mauna Kea volcano in Hawaii. The Keck Observatory operates two 10-meter telescopes. Each is not a single mirror, but is instead made up of 36 hexagonal mirrors. Furthermore, the two telescopes on the Keck can work together, which increases their power to an effective 85-meter mirror. The Hubble telescope (Figure 2.45) is another large reflecting telescope with a 2.4-meter-diameter primary mirror. The Hubble was put into orbit around Earth in 1990.


Figure 2.45 The Hubble space telescope as seen from the Space Shuttle Discovery. (credit: modification of work by NASA)


The angular magnification M of a reflecting telescope is also given by Equation 2.40. For a spherical mirror, the focal length is half the radius of curvature, so making a large objective mirror not only helps the telescope collect more light but also increases the magnification of the image.
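As a small illustration of this statement, the Python sketch below applies the telescope relation M = −f_obj/f_eye derived earlier in this section, with the objective focal length replaced by that of a spherical mirror, f = R/2. The function names, the 200 cm radius of curvature, and the 2.5 cm eyepiece are hypothetical values chosen only for the example.

```python
def mirror_focal_length(radius_of_curvature_cm):
    """Focal length of a spherical mirror: f = R / 2."""
    return radius_of_curvature_cm / 2.0


def reflecting_telescope_magnification(radius_of_curvature_cm, f_eye_cm):
    """Angular magnification M = -f_mirror / f_eye, with f_mirror = R / 2."""
    return -mirror_focal_length(radius_of_curvature_cm) / f_eye_cm


# Hypothetical mirror with R = 200 cm (so f = 100 cm) and a 2.5 cm eyepiece.
M_reflector = reflecting_telescope_magnification(200.0, 2.5)
print(M_reflector)
```

Doubling the radius of curvature in this sketch doubles the focal length and therefore the magnitude of the magnification.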








CHAPTER 2 REVIEW
KEY TERMS

aberration: distortion in an image caused by departures from the small-angle approximation
accommodation: use of the ciliary muscles to adjust the shape of the eye lens for focusing on near or far objects
angular magnification: ratio of the angle subtended by an object observed with a magnifier to that observed by the naked eye
apparent depth: depth at which an object is perceived to be located with respect to an interface between two media
Cassegrain design: arrangement of an objective and eyepiece such that the light-gathering concave mirror has a hole in the middle, and light then is incident on an eyepiece lens
charge-coupled device (CCD): semiconductor chip that converts a light image into tiny pixels that can be converted into electronic signals of color and intensity
coma: similar to spherical aberration, but arises when the incoming rays are not parallel to the optical axis
compound microscope: microscope constructed from two convex lenses, the first serving as the objective lens and the second serving as the eyepiece
concave mirror: spherical mirror with its reflecting surface on the inner side of the sphere; the mirror forms a “cave”
converging (or convex) lens: lens in which light rays that enter it parallel converge into a single point on the opposite side
convex mirror: spherical mirror with its reflecting surface on the outer side of the sphere
curved mirror: mirror formed by a curved surface, such as spherical, elliptical, or parabolic
diverging (or concave) lens: lens that causes light rays to bend away from its optical axis
eyepiece: lens or combination of lenses in an optical instrument nearest to the eye of the observer
far point: furthest point an eye can see in focus
farsightedness (or hyperopia): visual defect in which near objects appear blurred because their images are focused behind the retina rather than on the retina; a farsighted person can see far objects clearly but near objects appear blurred
first focus or object focus: object located at this point will result in an image created at infinity on the opposite side of a spherical interface between two media
focal length: distance along the optical axis from the focal point to the optical element that focuses the light rays
focal plane: plane that contains the focal point and is perpendicular to the optical axis
focal point: for a converging lens or mirror, the point at which converging light rays cross; for a diverging lens or mirror, the point from which diverging light rays appear to originate
image distance: distance of the image from the central axis of the optical element that produces the image
linear magnification: ratio of image height to object height
magnification: ratio of image size to object size
near point: closest point an eye can see in focus
nearsightedness (or myopia): visual defect in which far objects appear blurred because their images are focused in front of the retina rather than on the retina; a nearsighted person can see near objects clearly but far objects appear blurred
net magnification: ($M_{\text{net}}$) of the compound microscope is the product of the linear magnification of the objective and the angular magnification of the eyepiece
Newtonian design: arrangement of an objective and eyepiece such that the focused light from the concave mirror was reflected to one side of the tube into an eyepiece






object distance: distance of the object from the central axis of the optical element that produces its image
objective: lens nearest to the object being examined
optical axis: axis about which the mirror is rotationally symmetric; you can rotate the mirror about this axis without changing anything
optical power: (P) inverse of the focal length of a lens, with the focal length expressed in meters. The optical power P of a lens is expressed in units of diopters D; that is, $1\ \text{D} = 1/\text{m} = 1\ \text{m}^{-1}$
plane mirror: plane (flat) reflecting surface
ray tracing: technique that uses geometric constructions to find and characterize the image formed by an optical system
real image: image that can be projected onto a screen because the rays physically go through the image
second focus or image focus: for a converging interface, the point where a bundle of parallel rays refracting at a spherical interface between two media will focus; for a diverging interface, the point at which the backward continuation of the refracted rays will converge
simple magnifier (or magnifying glass): converging lens that produces a virtual image of an object that is within the focal length of the lens
small-angle approximation: approximation that is valid when the size of a spherical mirror is significantly smaller than the mirror’s radius; in this approximation, spherical aberration is negligible and the mirror has a well-defined focal point
spherical aberration: distortion in the image formed by a spherical mirror when rays are not all focused at the same point
thin-lens approximation: assumption that the lens is very thin compared to the first image distance
vertex: point where the mirror’s surface intersects with the optical axis
virtual image: image that cannot be projected on a screen because the rays do not physically go through the image; they only appear to originate from the image


KEY EQUATIONS
Image distance in a plane mirror: $d_o = -d_i$
Focal length for a spherical mirror: $f = \frac{R}{2}$
Mirror equation: $\frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f}$
Magnification of a spherical mirror: $m = \frac{h_i}{h_o} = -\frac{d_i}{d_o}$
Sign convention for mirrors:
  Focal length $f$: + for concave mirror, − for convex mirror
  Object distance $d_o$: + for real object, − for virtual object
  Image distance $d_i$: + for real image, − for virtual image
  Magnification $m$: + for upright image, − for inverted image
Apparent depth equation: $h_i = \left(\frac{n_2}{n_1}\right) h_o$








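Spherical interface equation: $\frac{n_1}{d_o} + \frac{n_2}{d_i} = \frac{n_2 - n_1}{R}$
The thin-lens equation: $\frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f}$
The lens maker’s equation: $\frac{1}{f} = \left(\frac{n_2}{n_1} - 1\right)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$
The magnification m of an object: $m \equiv \frac{h_i}{h_o} = -\frac{d_i}{d_o}$
Optical power: $P = \frac{1}{f}$
Optical power of thin, closely spaced lenses: $P_{\text{total}} = P_{\text{lens1}} + P_{\text{lens2}} + P_{\text{lens3}} + \cdots$
Angular magnification M of a simple magnifier: $M = \frac{\theta_{\text{image}}}{\theta_{\text{object}}}$
Angular magnification for an image a distance L from the eye, for a convex lens of focal length f held a distance ℓ from the eye: $M = \left(\frac{25\ \text{cm}}{L}\right)\left(1 + \frac{L - \ell}{f}\right)$
Range of angular magnification for a given lens for a person with a near point of 25 cm: $\frac{25\ \text{cm}}{f} \le M \le 1 + \frac{25\ \text{cm}}{f}$
Net magnification of compound microscope: $M_{\text{net}} = m^{\text{obj}} M^{\text{eye}} = -\frac{d_i^{\text{obj}}\left(f^{\text{eye}} + 25\ \text{cm}\right)}{f^{\text{obj}} f^{\text{eye}}}$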


SUMMARY
2.1 Images Formed by Plane Mirrors


• A plane mirror always forms a virtual image (behind the mirror).
• The image and object are the same distance from a flat mirror, the image size is the same as the object size, and the image is upright.


2.2 Spherical Mirrors
• Spherical mirrors may be concave (converging) or convex (diverging).
• The focal length of a spherical mirror is one-half of its radius of curvature: f = R/2 .
• The mirror equation and ray tracing allow you to give a complete description of an image formed by a spherical mirror.
• Spherical aberration occurs for spherical mirrors but not parabolic mirrors; comatic aberration occurs for both types of mirrors.


2.3 Images Formed by Refraction
This section explains how a single refracting interface forms images.


• When an object is observed through a plane interface between two media, then it appears at an apparent distance $h_i$ that differs from the actual distance $h_o$: $h_i = (n_2/n_1)h_o$.


• An image is formed by the refraction of light at a spherical interface between two media of indices of refraction $n_1$ and $n_2$.






• Image distance depends on the radius of curvature of the interface, location of the object, and the indices of refraction of the media.
2.4 Thin Lenses


• Two types of lenses are possible: converging and diverging. A lens that causes light rays to bend toward (away from) its optical axis is a converging (diverging) lens.
• For a converging lens, the focal point is where the converging light rays cross; for a diverging lens, the focal pointis the point from which the diverging light rays appear to originate.
• The distance from the center of a thin lens to its focal point is called the focal length f.
• Ray tracing is a geometric technique to determine the paths taken by light rays through thin lenses.
• A real image can be projected onto a screen.
• A virtual image cannot be projected onto a screen.
• A converging lens forms either real or virtual images, depending on the object location; a diverging lens forms only virtual images.


2.5 The Eye
• Image formation by the eye is adequately described by the thin-lens equation.
• The eye produces a real image on the retina by adjusting its focal length in a process called accommodation.
• Nearsightedness, or myopia, is the inability to see far objects and is corrected with a diverging lens to reduce the optical power of the eye.
• Farsightedness, or hyperopia, is the inability to see near objects and is corrected with a converging lens to increase the optical power of the eye.
• In myopia and hyperopia, the corrective lenses produce images at distances that fall between the person’s near and far points so that images can be seen clearly.
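
Accommodation can be made concrete by computing the optical power of the eye from the thin-lens equation, P = 1/do + 1/di, with distances in meters. The sketch below assumes a lens-to-retina distance of 2.00 cm, the value used in the problems of this chapter; the function name `eye_power` is a label chosen for this illustration.

```python
import math

RETINA_DISTANCE_M = 0.0200  # assumed lens-to-retina distance of 2.00 cm


def eye_power(d_o_m, d_i_m=RETINA_DISTANCE_M):
    """Optical power in diopters, P = 1/d_o + 1/d_i, with distances in meters."""
    return 1.0 / d_o_m + 1.0 / d_i_m


relaxed = eye_power(math.inf)   # very distant object, 1/d_o is zero
close = eye_power(0.25)         # object at the normal 25 cm near point
print(relaxed, close, close - relaxed)
```

Under these assumptions the relaxed eye has a power of 50.0 D and the fully accommodated eye 54.0 D, an increase of 4.0 D, or 8%.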


2.6 The Camera
• Cameras use combinations of lenses to create an image for recording.
• Digital photography is based on charge-coupled devices (CCDs) that break an image into tiny “pixels” that can be converted into electronic signals.


2.7 The Simple Magnifier
• A simple magnifier is a converging lens and produces a magnified virtual image of an object located within the focal length of the lens.
• Angular magnification accounts for magnification of an image created by a magnifier. It is equal to the ratio of the angle subtended by the image to that subtended by the object when the object is observed by the unaided eye.
• Angular magnification is greater for magnifying lenses with smaller focal lengths.
• Simple magnifiers can produce up to tenfold (10×) magnification.
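
The dependence of angular magnification on focal length can be sketched with the two standard limiting cases for a simple magnifier held close to the eye: M = (25 cm)/f when the image is at infinity (relaxed eye), and M = 1 + (25 cm)/f when the image is at the near point. The Python below is illustrative only; the function names and the default 25 cm near point are assumptions made for this sketch.

```python
NEAR_POINT_CM = 25.0  # assumed normal near point


def magnifier_relaxed(f_cm, near_point_cm=NEAR_POINT_CM):
    """Angular magnification with the image at infinity: M = near_point / f."""
    return near_point_cm / f_cm


def magnifier_near_point(f_cm, near_point_cm=NEAR_POINT_CM):
    """Angular magnification with the image at the near point: M = 1 + near_point / f."""
    return 1.0 + near_point_cm / f_cm


# Shorter focal lengths give larger angular magnification.
for f_cm in (10.0, 5.0, 2.5):
    print(f_cm, magnifier_relaxed(f_cm), magnifier_near_point(f_cm))
```

A focal length of 2.5 cm corresponds to the tenfold magnification quoted above for a relaxed eye.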


2.8 Microscopes and Telescopes
• Many optical devices contain more than a single lens or mirror. These are analyzed by considering each element sequentially. The image formed by the first is the object for the second, and so on. The same ray-tracing and thin-lens techniques developed in the previous sections apply to each lens element.
• The overall magnification of a multiple-element system is the product of the linear magnifications of its individual elements times the angular magnification of the eyepiece. For a two-element system with an objective and an eyepiece, this is

$M = m_{\text{obj}} M_{\text{eye}}$,




This OpenStax book is available for free at http://cnx.org/content/col12067/1.4




where $m_{\text{obj}}$ is the linear magnification of the objective and $M_{\text{eye}}$ is the angular magnification of the eyepiece.
• The microscope is a multiple-element system that contains more than a single lens or mirror. It allows us to see detail that we could not see with the unaided eye. Both the eyepiece and objective contribute to the magnification. The magnification of a compound microscope with the image at infinity is

$M_{\text{net}} = -\dfrac{(16\ \text{cm})(25\ \text{cm})}{f^{\text{obj}} f^{\text{eye}}}$.


In this equation, 16 cm is the standardized distance between the image-side focal point of the objective lens and the object-side focal point of the eyepiece, 25 cm is the normal near point distance, and $f^{\text{obj}}$ and $f^{\text{eye}}$ are the focal lengths of the objective lens and the eyepiece, respectively.


• Simple telescopes can be made with two lenses. They are used for viewing objects at large distances.
• The angular magnification M for a telescope is given by

$M = -\dfrac{f^{\text{obj}}}{f^{\text{eye}}}$,

where $f^{\text{obj}}$ and $f^{\text{eye}}$ are the focal lengths of the objective lens and the eyepiece, respectively.
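
The microscope and telescope formulas above can be evaluated directly. The following Python sketch is a numerical illustration, not a general instrument model; the function names are hypothetical, and the default values of 16 cm and 25 cm are the standardized distances quoted in the summary.

```python
def microscope_magnification(f_obj_cm, f_eye_cm, tube_cm=16.0, near_point_cm=25.0):
    """Net magnification of a compound microscope with the final image at infinity.

    M_net = -(16 cm)(25 cm) / (f_obj * f_eye), all lengths in cm.
    """
    return -(tube_cm * near_point_cm) / (f_obj_cm * f_eye_cm)


def telescope_magnification(f_obj_cm, f_eye_cm):
    """Angular magnification of a simple telescope, M = -f_obj / f_eye."""
    return -f_obj_cm / f_eye_cm


# Illustrative values only, not taken from any problem in the text.
print(microscope_magnification(0.80, 2.5))   # short-focal-length objective and eyepiece
print(telescope_magnification(120.0, 3.0))   # long objective, short eyepiece
```

These inputs give a net microscope magnification of −200 and a telescope angular magnification of −40. The negative sign indicates an inverted image in both cases.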


CONCEPTUAL QUESTIONS
2.1 Images Formed by Plane Mirrors
1. What are the differences between real and virtual images? How can you tell (by looking) whether an image formed by a single lens or mirror is real or virtual?
2. Can you see a virtual image? Explain your response.
3. Can you photograph a virtual image?
4. Can you project a virtual image onto a screen?
5. Is it necessary to project a real image onto a screen to see it?
6. Devise an arrangement of mirrors allowing you to see the back of your head. What is the minimum number of mirrors needed for this task?
7. If you wish to see your entire body in a flat mirror (from head to toe), how tall should the mirror be? Does its size depend upon your distance away from the mirror? Provide a sketch.


2.2 Spherical Mirrors
8. At what distance is an image always located: at do, di, or f?
9. Under what circumstances will an image be located at the focal point of a spherical lens or mirror?
10. What is meant by a negative magnification? What is meant by a magnification whose absolute value is less than one?
11. Can an image be larger than the object even though its magnification is negative? Explain.


2.3 Images Formed by Refraction
12. Derive the formula for the apparent depth of a fish in a fish tank using Snell’s law.
13. Use a ruler and a protractor to find the image by refraction in the following cases. Assume an air-glass interface. Use a refractive index of 1 for air and of 1.5 for glass. (Hint: Use Snell’s law at the interface.)
(a) A point object located on the axis of a concave interface located at a point within the focal length from the vertex.
(b) A point object located on the axis of a concave interface located at a point farther than the focal length from the vertex.
(c) A point object located on the axis of a convex interface located at a point within the focal length from the vertex.
(d) A point object located on the axis of a convex interface located at a point farther than the focal length from the vertex.
(e) Repeat (a)–(d) for a point object off the axis.


2.4 Thin Lenses
14. You can argue that a flat piece of glass, such as in a window, is like a lens with an infinite focal length. If so, where does it form an image? That is, how are di and do related?
15. When you focus a camera, you adjust the distance of the lens from the film. If the camera lens acts like a thin lens, why can it not be a fixed distance from the film for both near and distant objects?
16. A thin lens has two focal points, one on either side of the lens at equal distances from its center, and should behave the same for light entering from either side. Look backward and forward through a pair of eyeglasses and comment on whether they are thin lenses.
17. Will the focal length of a lens change when it is submerged in water? Explain.


2.5 The Eye
18. If the lens of a person’s eye is removed because of cataracts (as has been done since ancient times), why would you expect an eyeglass lens of about 16 D to be prescribed?
19. When laser light is shone into a relaxed normal-vision eye to repair a tear by spot-welding the retina to the back of the eye, the rays entering the eye must be parallel. Why?
20. Why is your vision so blurry when you open your eyes while swimming under water? How does a face mask enable clear vision?
21. It has become common to replace the cataract-clouded lens of the eye with an internal lens. This intraocular lens can be chosen so that the person has perfect distant vision. Will the person be able to read without glasses? If the person was nearsighted, is the power of the intraocular lens greater or less than the removed lens?
22. If the cornea is to be reshaped (this can be done surgically or with contact lenses) to correct myopia, should its curvature be made greater or smaller? Explain.


2.8 Microscopes and Telescopes
23. Geometric optics describes the interaction of light with macroscopic objects. Why, then, is it correct to use geometric optics to analyze a microscope’s image?
24. The image produced by the microscope in Figure 2.38 cannot be projected. Could extra lenses or mirrors project it? Explain.
25. If you want your microscope or telescope to project a real image onto a screen, how would you change the placement of the eyepiece relative to the objective?


PROBLEMS
2.1 Images Formed by Plane Mirrors
26. Consider a pair of flat mirrors that are positioned so that they form an angle of 120°. An object is placed on the bisector between the mirrors. Construct a ray diagram as in Figure 2.4 to show how many images are formed.
27. Consider a pair of flat mirrors that are positioned so that they form an angle of 60°. An object is placed on the bisector between the mirrors. Construct a ray diagram as in Figure 2.4 to show how many images are formed.
28. By using more than one flat mirror, construct a ray diagram showing how to create an inverted image.


2.2 Spherical Mirrors
29. The following figure shows a light bulb between two spherical mirrors. One mirror produces a beam of light with parallel rays; the other keeps light from escaping without being put into the beam. Where is the filament of the light in relation to the focal point or radius of curvature of each mirror?

30. Why are diverging mirrors often used for rearview mirrors in vehicles? What is the main disadvantage of using such a mirror compared with a flat one?
31. Some telephoto cameras use a mirror rather than a lens. What radius of curvature mirror is needed to replace an 800 mm-focal length telephoto lens?
32. Calculate the focal length of a mirror formed by the shiny back of a spoon that has a 3.00 cm radius of curvature.
33. Electric room heaters use a concave mirror to reflect infrared (IR) radiation from hot coils. Note that IR radiation follows the same law of reflection as visible light. Given that the mirror has a radius of curvature of 50.0 cm and produces an image of the coils 3.00 m away from the mirror, where are the coils?
34. Find the magnification of the heater element in the previous problem. Note that its large magnitude helps spread out the reflected energy.
35. What is the focal length of a makeup mirror that produces a magnification of 1.50 when a person’s face is 12.0 cm away? Explicitly show how you follow the steps in the Problem-Solving Strategy: Spherical Mirrors.
36. A shopper standing 3.00 m from a convex security mirror sees his image with a magnification of 0.250. (a) Where is his image? (b) What is the focal length of the mirror? (c) What is its radius of curvature?
37. An object 1.50 cm high is held 3.00 cm from a person’s cornea, and its reflected image is measured to be 0.167 cm high. (a) What is the magnification? (b) Where is the image? (c) Find the radius of curvature of the convex mirror formed by the cornea. (Note that this technique is used by optometrists to measure the curvature of the cornea for contact lens fitting. The instrument used is called a keratometer, or curve measurer.)
38. Ray tracing for a flat mirror shows that the image is located a distance behind the mirror equal to the distance of the object from the mirror. This is stated as di = −do, since this is a negative image distance (it is a virtual image). What is the focal length of a flat mirror?
39. Show that, for a flat mirror, hi = ho, given that the image is the same distance behind the mirror as the distance of the object from the mirror.
40. Use the law of reflection to prove that the focal length of a mirror is half its radius of curvature. That is, prove that f = R/2. Note this is true for a spherical mirror only if its diameter is small compared with its radius of curvature.


41. Referring to the electric room heater considered in problem 33, calculate the intensity of IR radiation in W/m² projected by the concave mirror on a person 3.00 m away. Assume that the heating element radiates 1500 W and has an area of 100 cm², and that half of the radiated power is reflected and focused by the mirror.
42. Two mirrors are inclined at an angle of 60° and an object is placed at a point that is equidistant from the two mirrors. Use a protractor to draw rays accurately and locate all images. You may have to draw several figures so that the rays for different images do not clutter your drawing.
43. Two parallel mirrors are facing each other and are separated by a distance of 3 cm. A point object is placed between the mirrors 1 cm from one of the mirrors. Find the coordinates of all the images.


2.3 Images Formed by Refraction
44. An object is located in air 30 cm from the vertex of a concave surface made of glass with a radius of curvature 10 cm. Where does the image by refraction form and what is its magnification? Use nair = 1 and nglass = 1.5.
45. An object is located in air 30 cm from the vertex of a convex surface made of glass with a radius of curvature 80 cm. Where does the image by refraction form and what is its magnification?
46. An object is located in water 15 cm from the vertex of a concave surface made of glass with a radius of curvature 10 cm. Where does the image by refraction form and what is its magnification? Use nwater = 4/3 and nglass = 1.5.
47. An object is located in water 30 cm from the vertex of a convex surface made of Plexiglas with a radius of curvature of 80 cm. Where does the image form by refraction and what is its magnification? Use nwater = 4/3 and nPlexiglas = 1.65.
48. An object is located in air 5 cm from the vertex of a concave surface made of glass with a radius of curvature 20 cm. Where does the image form by refraction and what is its magnification? Use nair = 1 and nglass = 1.5.
49. Derive the spherical interface equation for refraction at a concave surface. (Hint: Follow the derivation in the text for the convex surface.)






2.4 Thin Lenses
50. How far from the lens must the film in a camera be, if the lens has a 35.0-mm focal length and is being used to photograph a flower 75.0 cm away? Explicitly show how you follow the steps in the Problem-Solving Strategy: Lenses.
51. A certain slide projector has a 100 mm-focal length lens. (a) How far away is the screen if a slide is placed 103 mm from the lens and produces a sharp image? (b) If the slide is 24.0 by 36.0 mm, what are the dimensions of the image? Explicitly show how you follow the steps in the Problem-Solving Strategy: Lenses.
52. A doctor examines a mole with a 15.0-cm focal length magnifying glass held 13.5 cm from the mole. (a) Where is the image? (b) What is its magnification? (c) How big is the image of a 5.00 mm diameter mole?
53. A camera with a 50.0-mm focal length lens is being used to photograph a person standing 3.00 m away. (a) How far from the lens must the film be? (b) If the film is 36.0 mm high, what fraction of a 1.75-m-tall person will fit on it? (c) Discuss how reasonable this seems, based on your experience in taking or posing for photographs.
54. A camera lens used for taking close-up photographs has a focal length of 22.0 mm. The farthest it can be placed from the film is 33.0 mm. (a) What is the closest object that can be photographed? (b) What is the magnification of this closest object?
55. Suppose your 50.0 mm-focal length camera lens is 51.0 mm away from the film in the camera. (a) How far away is an object that is in focus? (b) What is the height of the object if its image is 2.00 cm high?
56. What is the focal length of a magnifying glass that produces a magnification of 3.00 when held 5.00 cm from an object, such as a rare coin?
57. The magnification of a book held 7.50 cm from a 10.0 cm-focal length lens is 3.00. (a) Find the magnification for the book when it is held 8.50 cm from the magnifier. (b) Repeat for the book held 9.50 cm from the magnifier. (c) Comment on how magnification changes as the object distance increases as in these two calculations.
58. Suppose a 200 mm-focal length telephoto lens is being used to photograph mountains 10.0 km away. (a) Where is the image? (b) What is the height of the image of a 1000 m high cliff on one of the mountains?
59. A camera with a 100 mm-focal length lens is used to photograph the sun. What is the height of the image of the sun on the film, given the sun is $1.40 \times 10^{6}$ km in diameter and is $1.50 \times 10^{8}$ km away?
60. Use the thin-lens equation to show that the magnification for a thin lens is determined by its focal length and the object distance and is given by m = f/(f − do).
61. An object of height 3.0 cm is placed 5.0 cm in front of a converging lens of focal length 20 cm and observed from the other side. Where and how large is the image?
62. An object of height 3.0 cm is placed at 5.0 cm in front of a diverging lens of focal length 20 cm and observed from the other side. Where and how large is the image?
63. An object of height 3.0 cm is placed at 25 cm in front of a diverging lens of focal length 20 cm. Behind the diverging lens, there is a converging lens of focal length 20 cm. The distance between the lenses is 5.0 cm. Find the location and size of the final image.
64. Two convex lenses of focal lengths 20 cm and 10 cm are placed 30 cm apart, with the lens with the longer focal length on the right. An object of height 2.0 cm is placed midway between them and observed through each lens from the left and from the right. Describe what you will see, such as where the image(s) will appear, whether they will be upright or inverted and their magnifications.


2.5 The Eye
Unless otherwise stated, the lens-to-retina distance is 2.00 cm.
65. What is the power of the eye when viewing an object 50.0 cm away?
66. Calculate the power of the eye when viewing an object 3.00 m away.
67. The print in many books averages 3.50 mm in height. How high is the image of the print on the retina when the book is held 30.0 cm from the eye?
68. Suppose a certain person’s visual acuity is such that he can see objects clearly that form an image 4.00 µm high on his retina. What is the maximum distance at which he can read the 75.0-cm-high letters on the side of an airplane?
69. People who do very detailed work close up, such as jewelers, often can see objects clearly at much closer distance than the normal 25 cm. (a) What is the power of the eyes of a woman who can see an object clearly at a distance of only 8.00 cm? (b) What is the image size of a 1.00-mm object, such as lettering inside a ring, held at this distance? (c) What would the size of the image be if the object were held at the normal 25.0 cm distance?
70. What is the far point of a person whose eyes have a relaxed power of 50.5 D?
71. What is the near point of a person whose eyes have an accommodated power of 53.5 D?
72. (a) A laser reshaping the cornea of a myopic patient reduces the power of his eye by 9.00 D, with a ±5.0% uncertainty in the final correction. What is the range of diopters for eyeglass lenses that this person might need after this procedure? (b) Was the person nearsighted or farsighted before the procedure? How do you know?
73. The power for normal close vision is 54.0 D. In a vision-correction procedure, the power of a patient’s eye is increased by 3.00 D. Assuming that this produces normal close vision, what was the patient’s near point before the procedure?
74. For normal distant vision, the eye has a power of 50.0 D. What was the previous far point of a patient who had laser vision correction that reduced the power of her eye by 7.00 D, producing normal distant vision?
75. The power for normal distant vision is 50.0 D. A severely myopic patient has a far point of 5.00 cm. By how many diopters should the power of his eye be reduced in laser vision correction to obtain normal distant vision for him?
76. A student’s eyes, while reading the blackboard, have a power of 51.0 D. How far is the board from his eyes?
77. The power of a physician’s eyes is 53.0 D while examining a patient. How far from her eyes is the object that is being examined?
78. The normal power for distant vision is 50.0 D. A young woman with normal distant vision has a 10.0% ability to accommodate (that is, increase) the power of her eyes. What is the closest object she can see clearly?
79. The far point of a myopic administrator is 50.0 cm. (a) What is the relaxed power of his eyes? (b) If he has the normal 8.00% ability to accommodate, what is the closest object he can see clearly?
80. A very myopic man has a far point of 20.0 cm. What power contact lens (when on the eye) will correct his distant vision?


81. Repeat the previous problem for eyeglasses held 1.50 cm from the eyes.
82. A myopic person sees that her contact lens prescription is −4.00 D. What is her far point?
83. Repeat the previous problem for glasses that are 1.75 cm from the eyes.
84. The contact lens prescription for a mildly farsighted person is 0.750 D, and the person has a near point of 29.0 cm. What is the power of the tear layer between the cornea and the lens if the correction is ideal, taking the tear layer into account?


2.7 The Simple Magnifier
85. If the image formed on the retina subtends an angle of 30° and the object subtends an angle of 5°, what is the magnification of the image?
86. What is the magnification of a magnifying lens with a focal length of 10 cm if it is held 3.0 cm from the eye and the object is 12 cm from the eye?
87. How far should you hold a 2.1 cm-focal length magnifying glass from an object to obtain a magnification of 10×? Assume you place your eye 5.0 cm from the magnifying glass.
88. You hold a 5.0 cm-focal length magnifying glass as close as possible to your eye. If you have a normal near point, what is the magnification?
89. You view a mountain with a magnifying glass of focal length f = 10 cm. What is the magnification?
90. You view an object by holding a 2.5 cm-focal length magnifying glass 10 cm away from it. How far from your eye should you hold the magnifying glass to obtain a magnification of 10×?
91. A magnifying glass forms an image 10 cm on the opposite side of the lens from the object, which is 10 cm away. What is the magnification of this lens for a person with a normal near point if their eye is 12 cm from the object?
92. An object viewed with the naked eye subtends a 2° angle. If you view the object through a 10× magnifying glass, what angle is subtended by the image formed on your retina?
93. For a normal, relaxed eye, a magnifying glass produces an angular magnification of 4.0. What is the largest magnification possible with this magnifying glass?
94. What range of magnification is possible with a 7.0 cm-focal length converging lens?
95. A magnifying glass produces an angular magnification of 4.5 when used by a young person with a near point of 18 cm. What is the maximum angular magnification obtained by an older person with a near point of 45 cm?


2.8 Microscopes and Telescopes
96. A microscope with an overall magnification of 800 has an objective that magnifies by 200. (a) What is the angular magnification of the eyepiece? (b) If there are two other objectives that can be used, having magnifications of 100 and 400, what other total magnifications are possible?
97. (a) What magnification is produced by a 0.150 cm-focal length microscope objective that is 0.155 cm from the object being viewed? (b) What is the overall magnification if an 8× eyepiece (one that produces an angular magnification of 8.00) is used?
98. Where does an object need to be placed relative to a microscope for its 0.50 cm-focal length objective to produce a magnification of −400?
99. An amoeba is 0.305 cm away from the 0.300 cm-focal length objective lens of a microscope. (a) Where is the image formed by the objective lens? (b) What is this image’s magnification? (c) An eyepiece with a 2.00-cm focal length is placed 20.0 cm from the objective. Where is the final image? (d) What angular magnification is produced by the eyepiece? (e) What is the overall magnification? (See Figure 2.39.)
100. Unreasonable Results Your friends show you an image through a microscope. They tell you that the microscope has an objective with a 0.500-cm focal length and an eyepiece with a 5.00-cm focal length. The resulting overall magnification is 250,000. Are these viable values for a microscope?
Unless otherwise stated, the lens-to-retina distance is 2.00 cm.
101. What is the angular magnification of a telescope that has a 100 cm-focal length objective and a 2.50 cm-focal length eyepiece?
102. Find the distance between the objective and eyepiece lenses in the telescope in the above problem needed to produce a final image very far from the observer, where vision is most relaxed. Note that a telescope is normally used to view very distant objects.


103. A large reflecting telescope has an objective mirror with a 10.0-m radius of curvature. What angular magnification does it produce when a 3.00 m-focal length eyepiece is used?
104. A small telescope has a concave mirror with a 2.00-m radius of curvature for its objective. Its eyepiece is a 4.00 cm-focal length lens. (a) What is the telescope’s angular magnification? (b) What angle is subtended by a 25,000 km-diameter sunspot? (c) What is the angle of its telescopic image?
105. A 7.5× binocular produces an angular magnification of −7.50, acting like a telescope. (Mirrors are used to make the image upright.) If the binoculars have objective lenses with a 75.0-cm focal length, what is the focal length of the eyepiece lenses?
106. Construct Your Own Problem Consider a telescope of the type used by Galileo, having a convex objective and a concave eyepiece as illustrated in part (a) of Figure 2.40. Construct a problem in which you calculate the location and size of the image produced. Among the things to be considered are the focal lengths of the lenses and their relative placements as well as the size and location of the object. Verify that the angular magnification is greater than one. That is, the angle subtended at the eye by the image is greater than the angle subtended by the object.
107. Trace rays to find which way the given ray will emerge after refraction through the thin lens in the following figure. Assume thin-lens approximation. (Hint: Pick a point P on the given ray in each case. Treat that point as an object. Now, find its image Q. Use the rule: All rays on the other side of the lens will either go through Q or appear to be coming from Q.)

108. Copy and draw rays to find the final image in the following diagram. (Hint: Find the intermediate image through the lens alone. Use the intermediate image as the object for the mirror and work with the mirror alone to find the final image.)






109. A concave mirror of radius of curvature 10 cm is placed 30 cm from a thin convex lens of focal length 15 cm. Find the location and magnification of a small bulb sitting 50 cm from the lens by using the algebraic method.
110. An object of height 3 cm is placed at 25 cm in front of a converging lens of focal length 20 cm. Behind the lens there is a concave mirror of focal length 20 cm. The distance between the lens and the mirror is 5 cm. Find the location, orientation and size of the final image.
111. An object of height 3 cm is placed at a distance of 25 cm in front of a converging lens of focal length 20 cm, to be referred to as the first lens. Behind the lens there is another converging lens of focal length 20 cm placed 10 cm from the first lens. There is a concave mirror of focal length 15 cm placed 50 cm from the second lens. Find the location, orientation, and size of the final image.
112. An object of height 2 cm is placed at 50 cm in front of a diverging lens of focal length 40 cm. Behind the lens, there is a convex mirror of focal length 15 cm placed 30 cm from the lens. Find the location, orientation, and size of the final image.
113. Two concave mirrors are placed facing each other. One of them has a small hole in the middle. A penny is placed on the bottom mirror (see the following figure). When you look from the side, a real image of the penny is observed above the hole. Explain how that could happen.


114. A lamp of height 5 cm is placed 40 cm in front of a converging lens of focal length 20 cm. There is a plane mirror 15 cm behind the lens. Where would you find the image when you look in the mirror?
115. Parallel rays from a faraway source strike a converging lens of focal length 20 cm at an angle of 15 degrees with the horizontal direction. Find the vertical position of the real image observed on a screen in the focal plane.
116. Parallel rays from a faraway source strike a diverging lens of focal length 20 cm at an angle of 10 degrees with the horizontal direction. As you look through the lens, where in the vertical plane would the image appear?
117. A light bulb is placed 10 cm from a plane mirror, which faces a convex mirror of radius of curvature 8 cm. The plane mirror is located at a distance of 30 cm from the vertex of the convex mirror. Find the location of two images in the convex mirror. Are there other images? If so, where are they located?
118. A point source of light is 50 cm in front of a converging lens of focal length 30 cm. A concave mirror with a focal length of 20 cm is placed 25 cm behind the lens. Where does the final image form, and what are its orientation and magnification?
119. Copy and trace to find how a horizontal ray from S comes out after the lens. Use nglass = 1.5 for the prism material.


120. Copy and trace how a horizontal ray from S comes out after the lens. Use n = 1.55 for the glass.

121. Copy and draw rays to figure out the final image.

122. By ray tracing or by calculation, find the place inside the glass where rays from S converge as a result of refraction through the lens and the convex air-glass interface. Use a ruler to estimate the radius of curvature.






123. A diverging lens has a focal length of 20 cm. What is the power of the lens in diopters?
124. Two lenses of focal lengths f1 and f2 are glued together with transparent material of negligible thickness. Show that the powers of the two lenses simply add.
125. What will be the angular magnification of a convex lens with the focal length 2.5 cm?
126. What will be the formula for the angular magnification of a convex lens of focal length f if the eye is very close to the lens and the near point is located a distance D from the eye?


ADDITIONAL PROBLEMS
127. Use a ruler and a protractor to draw rays to find images in the following cases.
(a) A point object located on the axis of a concave mirror located at a point within the focal length from the vertex.
(b) A point object located on the axis of a concave mirror located at a point farther than the focal length from the vertex.
(c) A point object located on the axis of a convex mirror located at a point within the focal length from the vertex.
(d) A point object located on the axis of a convex mirror located at a point farther than the focal length from the vertex.
(e) Repeat (a)–(d) for a point object off the axis.
128. Where should a 3 cm tall object be placed in front of a concave mirror of radius 20 cm so that its image is real and 2 cm tall?
129. A 3 cm tall object is placed 5 cm in front of a convex mirror of radius of curvature 20 cm. Where is the image formed? How tall is the image? What is the orientation of the image?
130. You are looking for a mirror so that you can see a four-fold magnified virtual image of an object when the object is placed 5 cm from the vertex of the mirror. What kind of mirror will you need? What should be the radius of curvature of the mirror?
131. Derive the following equation for a convex mirror:

$\dfrac{1}{VO} - \dfrac{1}{VI} = -\dfrac{1}{VF}$,

where VO is the distance to the object O from vertex V, VI is the distance to the image I from V, and VF is the distance to the focal point F from V. (Hint: use two sets of similar triangles.)
132. (a) Draw rays to form the image of a vertical object on the optical axis and farther than the focal point from a converging lens. (b) Use plane geometry in your figure and prove that the magnification m is given by

$m = \dfrac{h_i}{h_o} = -\dfrac{d_i}{d_o}$.


133. Use another ray-tracing diagram for the same situation as given in the previous problem to derive the thin-lens equation, $\dfrac{1}{d_o} + \dfrac{1}{d_i} = \dfrac{1}{f}$.


134. You photograph a 2.0-m-tall person with a camera that has a 5.0 cm-focal length lens. The image on the film must be no more than 2.0 cm high. (a) What is the closest distance the person can stand to the lens? (b) For this distance, what should be the distance from the lens to the film?
135. Find the focal length of a thin plano-convex lens. The front surface of this lens is flat, and the rear surface has a radius of curvature of R2 = −35 cm. Assume that the index of refraction of the lens is 1.5.
136. Find the focal length of a meniscus lens with R1 = 20 cm and R2 = 15 cm. Assume that the index of refraction of the lens is 1.5.
137. A nearsighted man cannot see objects clearly beyond 20 cm from his eyes. How close must he stand to a mirror in order to see what he is doing when he shaves?
138. A mother sees that her child’s contact lens prescription is 0.750 D. What is the child’s near point?
139. Repeat the previous problem for glasses that are 2.20 cm from the eyes.
140. The contact-lens prescription for a nearsighted person is −4.00 D and the person has a far point of 22.5 cm. What is the power of the tear layer between the cornea and the lens if the correction is ideal, taking the tear layer into account?
141. Unreasonable Results A boy has a near point of 50 cm and a far point of 500 cm. Will a −4.00 D lens correct his far point to infinity?
142. Find the angular magnification of an image by a magnifying glass of f = 5.0 cm if the object is placed do = 4.0 cm from the lens and the lens is close to the eye.
143. Let the objective and eyepiece of a compound microscope have focal lengths of 2.5 cm and 10 cm, respectively, and be separated by 12 cm. A 70-µm object is placed 6.0 cm from the objective. How large is the virtual image formed by the objective-eyepiece system?
144. Draw rays to scale to locate the image at the retina if the eye lens has a focal length of 2.5 cm and the near point is 24 cm. (Hint: Place an object at the near point.)
145. The objective and the eyepiece of a microscope have focal lengths of 3 cm and 10 cm, respectively. Decide on the distance between the objective and the eyepiece if we need a 10× magnification from the objective/eyepiece compound system.
146. A far-sighted person has a near point of 100 cm. How far in front of or behind the retina does the image of an object placed 25 cm from the eye form? Use the cornea-to-retina distance of 2.5 cm.
147. A near-sighted person has a far point of 80 cm. (a) What kind of corrective lens will the person need if the lens is to be placed 1.5 cm from the eye? (b) What would be the power of the contact lens needed? Assume the distance to the contact lens from the eye to be zero.


148. In a reflecting telescope the objective is a concave mirror of radius of curvature 2 m and an eyepiece is a convex lens of focal length 5 cm. Find the apparent size of a 25-m tree at a distance of 10 km that you would perceive when looking through the telescope.
149. Two stars that are 10^9 km apart are viewed by a telescope and found to be separated by an angle of 10^−5 radians. If the eyepiece of the telescope has a focal length of 1.5 cm and the objective has a focal length of 3 meters, how far away are the stars from the observer?
150. What is the angular size of the Moon if viewed from a binocular that has a focal length of 1.2 cm for the eyepiece and a focal length of 8 cm for the objective? Use the radius of the Moon 1.74 × 10^6 m and the distance of the Moon from the observer to be 3.8 × 10^8 m.
151. An unknown planet at a distance of 10^12 m from Earth is observed by a telescope that has a focal length of the eyepiece of 1 cm and a focal length of the objective of 1 m. If the faraway planet is seen to subtend an angle of 10^−5 radian at the eyepiece, what is the size of the planet?






3 | INTERFERENCE


Figure 3.1 Soap bubbles are blown from clear fluid into very thin films. The colors we see are not due to any pigmentation but are the result of light interference, which enhances specific wavelengths for a given thickness of the film.


Chapter Outline
3.1 Young's Double-Slit Interference
3.2 Mathematics of Interference
3.3 Multiple-Slit Interference
3.4 Interference in Thin Films
3.5 The Michelson Interferometer


Introduction
The most certain indication of a wave is interference. This wave characteristic is most prominent when the wave interacts with an object that is not large compared with the wavelength. Interference is observed for water waves, sound waves, light waves, and, in fact, all types of waves.
If you have ever looked at the reds, blues, and greens in a sunlit soap bubble and wondered how straw-colored soapy water could produce them, you have hit upon one of the many phenomena that can only be explained by the wave character of light (see Figure 3.1). The same is true for the colors seen in an oil slick or in the light reflected from a DVD disc. These and other interesting phenomena cannot be explained fully by geometric optics. In these cases, light interacts with objects and exhibits wave characteristics. The branch of optics that considers the behavior of light when it exhibits wave characteristics is called wave optics (sometimes called physical optics). It is the topic of this chapter.
3.1 | Young's Double-Slit Interference


Learning Objectives
By the end of this section, you will be able to:
• Explain the phenomenon of interference
• Define constructive and destructive interference for a double slit


The Dutch physicist Christiaan Huygens (1629–1695) thought that light was a wave, but Isaac Newton did not. Newton thought that there were other explanations for color, and for the interference and diffraction effects that were observable at the time. Owing to Newton’s tremendous reputation, his view generally prevailed; the fact that Huygens’s principle worked was not considered direct evidence proving that light is a wave. The acceptance of the wave character of light came many years later in 1801, when the English physicist and physician Thomas Young (1773–1829) demonstrated optical interference with his now-classic double-slit experiment.
If there were not one but two sources of waves, the waves could be made to interfere, as in the case of waves on water (Figure 3.2). If light is an electromagnetic wave, it must therefore exhibit interference effects under appropriate circumstances. In Young’s experiment, sunlight was passed through a pinhole on a board. The emerging beam fell on two pinholes on a second board. The light emanating from the two pinholes then fell on a screen where a pattern of bright and dark spots was observed. This pattern, called fringes, can only be explained through interference, a wave phenomenon.


Figure 3.2 Photograph of an interference pattern produced by circular water waves in a ripple tank. Two thin plungers are vibrated up and down in phase at the surface of the water. Circular water waves are produced by and emanate from each plunger. The points where the water is calm (corresponding to destructive interference) are clearly visible.


We can analyze double-slit interference with the help of Figure 3.3, which depicts an apparatus analogous to Young’s. Light from a monochromatic source falls on a slit S0. The light emanating from S0 is incident on two other slits S1 and S2 that are equidistant from S0. A pattern of interference fringes on the screen is then produced by the light emanating from S1 and S2. All slits are assumed to be so narrow that they can be considered secondary point sources for Huygens’ wavelets (The Nature of Light). Slits S1 and S2 are a distance d apart (d ≤ 1 mm), and the distance between the screen and the slits is D (≈ 1 m), which is much greater than d.






Figure 3.3 The double-slit interference experiment using monochromatic light and narrow slits. Fringes produced by interfering Huygens wavelets from slits S1 and S2 are observed on the screen.


Since S0 is assumed to be a point source of monochromatic light, the secondary Huygens wavelets leaving S1 and S2 always maintain a constant phase difference (zero in this case because S1 and S2 are equidistant from S0) and have the same frequency. The sources S1 and S2 are then said to be coherent. By coherent waves, we mean the waves are in phase or have a definite phase relationship. The term incoherent means the waves have random phase relationships, which would be the case if S1 and S2 were illuminated by two independent light sources, rather than a single source S0. Two independent light sources (which may be two separate areas within the same lamp or the Sun) would generally not emit their light in unison, that is, not coherently. Also, because S1 and S2 are the same distance from S0, the amplitudes of the two Huygens wavelets are equal.
Young used sunlight, where each wavelength forms its own pattern, making the effect more difficult to see. In the following discussion, we illustrate the double-slit experiment with monochromatic light (single λ) to clarify the effect. Figure 3.4 shows the pure constructive and destructive interference of two waves having the same wavelength and amplitude.


Figure 3.4 The amplitudes of waves add. (a) Pure constructive interference is obtained when identical waves are in phase. (b) Pure destructive interference occurs when identical waves are exactly out of phase, or shifted by half a wavelength.






When light passes through narrow slits, the slits act as sources of coherent waves and light spreads out as semicircular waves, as shown in Figure 3.5(a). Pure constructive interference occurs where the waves are crest to crest or trough to trough. Pure destructive interference occurs where they are crest to trough. The light must fall on a screen and be scattered into our eyes for us to see the pattern. An analogous pattern for water waves is shown in Figure 3.2. Note that regions of constructive and destructive interference move out from the slits at well-defined angles to the original beam. These angles depend on wavelength and the distance between the slits, as we shall see below.


Figure 3.5 Double slits produce two coherent sources of waves that interfere. (a) Light spreads out (diffracts) from each slit, because the slits are narrow. These waves overlap and interfere constructively (bright lines) and destructively (dark regions). We can only see this if the light falls onto a screen and is scattered into our eyes. (b) When light that has passed through double slits falls on a screen, we see a pattern such as this.


To understand the double-slit interference pattern, consider how two waves travel from the slits to the screen (Figure 3.6). Each slit is a different distance from a given point on the screen. Thus, different numbers of wavelengths fit into each path. Waves start out from the slits in phase (crest to crest), but they may end up out of phase (crest to trough) at the screen if the paths differ in length by half a wavelength, interfering destructively. If the paths differ by a whole wavelength, then the waves arrive in phase (crest to crest) at the screen, interfering constructively. More generally, if the path length difference Δl between the two waves is any half-integral number of wavelengths [(1/2)λ, (3/2)λ, (5/2)λ, etc.], then destructive interference occurs. Similarly, if the path length difference is any integral number of wavelengths (λ, 2λ, 3λ, etc.), then constructive interference occurs. These conditions can be expressed as equations:
$$\Delta l = m\lambda, \quad \text{for } m = 0, \pm 1, \pm 2, \pm 3, \ldots \quad \text{(constructive interference)} \tag{3.1}$$
$$\Delta l = \left(m + \tfrac{1}{2}\right)\lambda, \quad \text{for } m = 0, \pm 1, \pm 2, \pm 3, \ldots \quad \text{(destructive interference)} \tag{3.2}$$






Figure 3.6 Waves follow different paths from the slits to a common point P on a screen. Destructive interference occurs where one path is a half wavelength longer than the other; the waves start in phase but arrive out of phase. Constructive interference occurs where one path is a whole wavelength longer than the other; the waves start out and arrive in phase.


3.2 | Mathematics of Interference
Learning Objectives


By the end of this section, you will be able to:
• Determine the angles for bright and dark fringes for double slit interference
• Calculate the positions of bright fringes on a screen


Figure 3.7(a) shows how to determine the path length difference Δl for waves traveling from two slits to a common point
on a screen. If the screen is a large distance away compared with the distance between the slits, then the angle θ between
the path and a line from the slits to the screen [part (b)] is nearly the same for each path. In other words, r1 and r2 are
essentially parallel. The lengths of r1 and r2 differ by Δl , as indicated by the two dashed lines in the figure. Simple
trigonometry shows


$$\Delta l = d \sin\theta \tag{3.3}$$
where d is the distance between the slits. Combining this result with Equation 3.1, we obtain constructive interference for a double slit when the path length difference is an integral multiple of the wavelength, or
$$d \sin\theta = m\lambda, \quad \text{for } m = 0, \pm 1, \pm 2, \pm 3, \ldots \quad \text{(constructive interference)}. \tag{3.4}$$
Similarly, to obtain destructive interference for a double slit, the path length difference must be a half-integral multiple of the wavelength, or
$$d \sin\theta = \left(m + \tfrac{1}{2}\right)\lambda, \quad \text{for } m = 0, \pm 1, \pm 2, \pm 3, \ldots \quad \text{(destructive interference)} \tag{3.5}$$


where λ is the wavelength of the light, d is the distance between slits, and θ is the angle from the original direction of the
beam as discussed above. We call m the order of the interference. For example, m = 4 is fourth-order interference.
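Equations 3.4 and 3.5 can be solved for θ directly. The following Python sketch does so; the function names and the sample values (500-nm light, slits 0.0100 mm apart) are illustrative assumptions rather than values from the text.

```python
import math

def bright_angle_deg(m, wavelength, d):
    """Angle (degrees) of the order-m bright fringe, from d sin(theta) = m * wavelength."""
    return math.degrees(math.asin(m * wavelength / d))

def dark_angle_deg(m, wavelength, d):
    """Angle (degrees) of the order-m dark fringe, from d sin(theta) = (m + 1/2) * wavelength."""
    return math.degrees(math.asin((m + 0.5) * wavelength / d))

# Assumed sample values: 500-nm light, slit separation 0.0100 mm (both in metres).
wavelength = 500e-9
d = 1.00e-5

for m in range(3):
    bright = bright_angle_deg(m, wavelength, d)
    dark = dark_angle_deg(m, wavelength, d)
    print(f"m = {m}: bright at {bright:.2f} deg, dark at {dark:.2f} deg")
```

Note that `math.asin` raises `ValueError` when the argument exceeds 1 in magnitude, which corresponds to an order that does not exist for the chosen d and λ.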






Figure 3.7 (a) To reach P, the light waves from S1 and S2 must travel different distances. (b) The path difference
between the two rays is Δl .


The equations for double-slit interference imply that a series of bright and dark lines are formed. For vertical slits, the light spreads out horizontally on either side of the incident beam into a pattern called interference fringes (Figure 3.8). The closer the slits are, the more the bright fringes spread apart. We can see this by examining the equation d sin θ = mλ, for m = 0, ±1, ±2, ±3, … . For fixed λ and m, the smaller d is, the larger θ must be, since sin θ = mλ/d. This is consistent with our contention that wave effects are most noticeable when the object the wave encounters (here, slits a distance d apart) is small. Small d gives large θ, hence, a large effect.
Referring back to part (a) of Figure 3.7, θ is typically small enough that sin θ ≈ tan θ ≈ ym/D, where ym is the distance from the central maximum to the mth bright fringe and D is the distance between the slits and the screen. Equation 3.4 may then be written as
$$d\,\frac{y_m}{D} = m\lambda$$
or
$$y_m = \frac{m\lambda D}{d}. \tag{3.6}$$





Figure 3.8 The interference pattern for a double slit has an intensity that falls off with angle. The image shows multiple bright and dark lines, or fringes, formed by light passing through a double slit.


Example 3.1
Finding a Wavelength from an Interference Pattern
Suppose you pass light from a He-Ne laser through two slits separated by 0.0100 mm and find that the third bright line on a screen is formed at an angle of 10.95° relative to the incident beam. What is the wavelength of the light?
Strategy
The phenomenon is two-slit interference as illustrated in Figure 3.8 and the third bright line is due to third-order constructive interference, which means that m = 3 . We are given d = 0.0100 mm and θ = 10.95° . The
wavelength can thus be found using the equation d sin θ = mλ for constructive interference.
Solution
Solving d sin θ = mλ for the wavelength λ gives


$$\lambda = \frac{d \sin\theta}{m}.$$
Substituting known values yields
$$\lambda = \frac{(0.0100\ \text{mm})(\sin 10.95^\circ)}{3} = 6.33 \times 10^{-4}\ \text{mm} = 633\ \text{nm}.$$


Significance
To three digits, this is the wavelength of light emitted by the common He-Ne laser. Not by coincidence, this red color is similar to that emitted by neon lights. More important, however, is the fact that interference patterns can be used to measure wavelength. Young did this for visible wavelengths. This analytical technique is still widely used to measure electromagnetic spectra. For a given order, the angle for constructive interference increases with λ, so that spectra (measurements of intensity versus wavelength) can be obtained.
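The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch; the function name is a hypothetical choice, and the inputs are the values given in the example.

```python
import math

def wavelength_from_fringe(d, theta_deg, m):
    """Solve d sin(theta) = m * wavelength for the wavelength (same units as d)."""
    return d * math.sin(math.radians(theta_deg)) / m

# Values from Example 3.1: d = 0.0100 mm, third bright line at 10.95 degrees.
wavelength = wavelength_from_fringe(0.0100e-3, 10.95, 3)  # metres
print(f"{wavelength * 1e9:.0f} nm")  # 633 nm
```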




Example 3.2
Calculating the Highest Order Possible
Interference patterns do not have an infinite number of lines, since there is a limit to how big m can be. What is the highest-order constructive interference possible with the system described in the preceding example?
Strategy
The equation d sin θ = mλ (for m = 0, ±1, ±2, ±3… ) describes constructive interference from two slits. For
fixed values of d and λ , the larger m is, the larger sin θ is. However, the maximum value that sin θ can have is
1, for an angle of 90° . (Larger angles imply that light goes backward and does not reach the screen at all.) Let us
find what value of m corresponds to this maximum diffraction angle.
Solution
Solving the equation d sin θ = mλ for m gives


$$m = \frac{d \sin\theta}{\lambda}.$$
Taking sin θ = 1 and substituting the values of d and λ from the preceding example gives
$$m = \frac{(0.0100\ \text{mm})(1)}{633\ \text{nm}} \approx 15.8.$$


Therefore, the largest integer m can be is 15, or m = 15 .
Significance
The number of fringes depends on the wavelength and slit separation. The number of fringes is very large for large slit separations. However, recall (see The Propagation of Light and the introduction for this chapter) that wave interference is only prominent when the wave interacts with objects that are not large compared to the wavelength. Therefore, if the slit separation and the sizes of the slits become much greater than the wavelength, the intensity pattern of light on the screen changes, so there are simply two bright lines cast by the slits, as expected, when light behaves like rays. We also note that the fringes get fainter farther away from the center. Consequently, not all 15 fringes may be observable.
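The highest order follows from requiring sin θ = mλ/d ≤ 1, so it is the largest integer not exceeding d/λ. A minimal sketch of that calculation, with a hypothetical function name and the values of this example, is:

```python
import math

def highest_order(d, wavelength):
    """Largest integer m with m * wavelength / d <= 1, that is, floor(d / wavelength)."""
    return math.floor(d / wavelength)

d = 0.0100e-3         # slit separation in metres
wavelength = 633e-9   # He-Ne wavelength in metres

m_max = highest_order(d, wavelength)
print(m_max)  # 15
```

If d happens to be an exact integer multiple of λ, floating-point rounding in the division could shift the result by one, so such edge cases deserve a separate check.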


Check Your Understanding 3.1 In the system used in the preceding examples, at what angles are the first and the second bright fringes formed?


3.3 | Multiple-Slit Interference
Learning Objectives


By the end of this section, you will be able to:
• Describe the locations and intensities of secondary maxima for multiple-slit interference


Analyzing the interference of light passing through two slits lays out the theoretical framework of interference and gives us a historical insight into Thomas Young’s experiments. However, much of the modern-day application of slit interference uses not just two slits but many, approaching infinity for practical purposes. The key optical element is called a diffraction grating, an important tool in optical analysis, which we discuss in detail in Diffraction. Here, we start the analysis of multiple-slit interference by taking the results from our analysis of the double slit (N = 2) and extending it to configurations with three, four, and much larger numbers of slits.
Figure 3.9 shows the simplest case of multiple-slit interference, with three slits, or N = 3 . The spacing between slits is d,
and the path length difference between adjacent slits is d sin θ , same as the case for the double slit. What is new is that the
path length difference for the first and the third slits is 2d sin θ. The condition for constructive interference is the same as for the double slit, that is
$$d \sin\theta = m\lambda.$$


When this condition is met, 2d sin θ is automatically a multiple of λ, so all three rays combine constructively, and the bright fringes that occur here are called principal maxima. But what happens when the path length difference between adjacent slits is only λ/2? We can think of the first and second rays as interfering destructively, but the third ray remains unaltered. Instead of obtaining a dark fringe, or a minimum, as we did for the double slit, we see a secondary maximum with intensity lower than the principal maxima.


Figure 3.9 Interference with three slits. Different pairs of emerging rays can combine constructively or destructively at the same time, leading to secondary maxima.


In general, for N slits, these secondary maxima occur whenever an unpaired ray is present that does not go away due to destructive interference. This occurs at (N − 2) evenly spaced positions between the principal maxima. The amplitude of the electromagnetic wave is correspondingly diminished to 1/N of the wave at the principal maxima, and the light intensity, being proportional to the square of the wave amplitude, is diminished to 1/N² of the intensity at the principal maxima. As Figure 3.10 shows, a dark fringe is located between every pair of adjacent maxima (principal or secondary). As N grows larger and the number of bright and dark fringes increases, the widths of the maxima become narrower due to the closely located neighboring dark fringes. Because the total amount of light energy remains unaltered, narrower maxima require that each maximum reaches a correspondingly higher intensity.
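One way to check the count of secondary maxima is to add the waves from the N slits numerically. The Python sketch below assumes N identical, very narrow slits that contribute equal amplitudes, with a phase step δ = 2π d sin θ/λ between adjacent slits. The function names and the sampling resolution are illustrative choices, not part of the text.

```python
import cmath
import math

def relative_intensity(N, delta):
    """Intensity of N equal-amplitude waves whose phases step by delta,
    normalized so that a principal maximum (delta = 0) has intensity 1."""
    amplitude = sum(cmath.exp(1j * k * delta) for k in range(N))
    return abs(amplitude) ** 2 / N ** 2

def count_secondary_maxima(N, samples=10000):
    """Count local maxima strictly between two adjacent principal maxima,
    that is, for 0 < delta < 2*pi, by sampling the intensity on a fine grid."""
    deltas = [2 * math.pi * i / samples for i in range(samples + 1)]
    values = [relative_intensity(N, x) for x in deltas]
    return sum(
        1
        for i in range(1, samples)
        if values[i] > values[i - 1] and values[i] > values[i + 1]
    )

for N in (2, 3, 4, 5):
    print(N, count_secondary_maxima(N))
```

For N = 3 the single secondary maximum sits at δ = π (adjacent path difference λ/2), where `relative_intensity(3, math.pi)` evaluates to 1/9, in agreement with the 1/N² value quoted above for that case.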






Figure 3.10 Interference fringe patterns for two, three, and four slits. As the number of slits increases, more secondary maxima appear, but the principal maxima become brighter and narrower. (a) Graph and (b) photographs of fringe patterns.


3.4 | Interference in Thin Films
Learning Objectives


By the end of this section, you will be able to:
• Describe the phase changes that occur upon reflection
• Describe fringes established by reflected rays of a common source
• Explain the appearance of colors in thin films


The bright colors seen in an oil slick floating on water or in a sunlit soap bubble are caused by interference. The brightest colors are those that interfere constructively. This interference is between light reflected from different surfaces of a thin film; thus, the effect is known as thin-film interference.
As we noted before, interference effects are most prominent when light interacts with something having a size similar to its wavelength. A thin film is one having a thickness t smaller than a few times the wavelength of light, λ. Since color is
associated indirectly with λ and because all interference depends in some way on the ratio of λ to the size of the object
involved, we should expect to see different colors for different thicknesses of a film, as in Figure 3.11.


Figure 3.11 These soap bubbles exhibit brilliant colors when exposed to sunlight. (credit: Scott Robinson)






What causes thin-film interference? Figure 3.12 shows how light reflected from the top and bottom surfaces of a film can interfere. Incident light is only partially reflected from the top surface of the film (ray 1). The remainder enters the film and is itself partially reflected from the bottom surface. Part of the light reflected from the bottom surface can emerge from the top of the film (ray 2) and interfere with light reflected from the top (ray 1). The ray that enters the film travels a greater distance, so it may be in or out of phase with the ray reflected from the top. However, consider for a moment, again, the bubbles in Figure 3.11. The bubbles are darkest where they are thinnest. Furthermore, if you observe a soap bubble carefully, you will note it gets dark at the point where it breaks. For very thin films, the difference in path lengths of rays 1 and 2 in Figure 3.12 is negligible, so why should they interfere destructively and not constructively? The answer is that a phase change can occur upon reflection, as discussed next.


Figure 3.12 Light striking a thin film is partially reflected (ray 1) and partially refracted at the top surface. The refracted ray is partially reflected at the bottom surface and emerges as ray 2. These rays interfere in a way that depends on the thickness of the film and the indices of refraction of the various media.
Changes in Phase due to Reflection
We saw earlier (Waves (http://cnx.org/content/m58367/latest/)) that reflection of mechanical waves can involve a 180° phase change. For example, a traveling wave on a string is inverted (i.e., a 180° phase change) upon reflection at a boundary to which a heavier string is tied. However, if the second string is lighter (or more precisely, of a lower linear density), no inversion occurs. Light waves produce the same effect, but the deciding parameter for light is the index of refraction. Light waves undergo a 180° or π radians phase change upon reflection at an interface beyond which is a medium of higher index of refraction. No phase change takes place when reflecting from a medium of lower refractive index (Figure 3.13). Because of the periodic nature of waves, this phase change or inversion is equivalent to ±λ/2 in distance travelled, or path length. Both the path length and refractive indices are important factors in thin-film interference.






Figure 3.13 Reflection at an interface for light traveling from amedium with index of refraction n1 to a medium with index of
refraction n2 , n1 < n2 , causes the phase of the wave to change
by π radians.


If the film in Figure 3.12 is a soap bubble (essentially water with air on both sides), then a phase shift of λ/2 occurs for ray 1 but not for ray 2. Thus, when the film is very thin and the path length difference between the two rays is negligible, they are exactly out of phase, and destructive interference occurs at all wavelengths. Thus, the soap bubble is dark here.
The thickness of the film relative to the wavelength of light is the other crucial factor in thin-film interference. Ray 2 in Figure 3.12 travels a greater distance than ray 1. For light incident perpendicular to the surface, ray 2 travels a distance approximately 2t farther than ray 1. When this distance is an integral or half-integral multiple of the wavelength in the medium (λn = λ/n, where λ is the wavelength in vacuum and n is the index of refraction), constructive or destructive
interference occurs, depending also on whether there is a phase change in either ray.
Example 3.3


Calculating the Thickness of a Nonreflective Lens Coating
Sophisticated cameras use a series of several lenses. Light can reflect from the surfaces of these various lenses and degrade image clarity. To limit these reflections, lenses are coated with a thin layer of magnesium fluoride, which causes destructive thin-film interference. What is the thinnest this film can be, if its index of refraction is 1.38 and it is designed to limit the reflection of 550-nm light, normally the most intense visible wavelength? Assume the index of refraction of the glass is 1.52.
Strategy
Refer to Figure 3.12 and use n1 = 1.00 for air, n2 = 1.38, and n3 = 1.52. Both ray 1 and ray 2 have a λ/2 shift upon reflection. Thus, to obtain destructive interference, ray 2 needs to travel a half wavelength farther than ray 1. For rays incident perpendicularly, the path length difference is 2t.
Solution
To obtain destructive interference here,


$$2t = \frac{\lambda_{n_2}}{2},$$
where λn2 is the wavelength in the film and is given by λn2 = λ/n2. Thus,
$$2t = \frac{\lambda/n_2}{2}.$$


Solving for t and entering known values yields


$$t = \frac{\lambda/n_2}{4} = \frac{(550\ \text{nm})/1.38}{4} = 99.6\ \text{nm}.$$


Significance
Films such as the one in this example are most effective in producing destructive interference when the thinnest layer is used, since light over a broader range of incident angles is reduced in intensity. These films are called nonreflective coatings; this is only an approximately correct description, though, since other wavelengths are only partially cancelled. Nonreflective coatings are also used in car windows and sunglasses.
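The quarter-wavelength result of this example, t = λ/(4n2), is easy to evaluate for other design wavelengths or coating materials. The sketch below assumes normal incidence and a coating whose index lies between those of air and glass, so that both reflections carry the same λ/2 shift; the function name is a hypothetical choice.

```python
def min_antireflection_thickness(wavelength_vacuum, n_film):
    """Thinnest film satisfying 2t = (wavelength / n_film) / 2,
    that is, t = wavelength / (4 * n_film).

    Assumes normal incidence and n_air < n_film < n_glass, so that the two
    reflected rays have equal phase shifts on reflection.
    """
    return wavelength_vacuum / (4 * n_film)

# Values from Example 3.3: 550-nm light, magnesium fluoride with n = 1.38.
t = min_antireflection_thickness(550e-9, 1.38)
print(f"{t * 1e9:.1f} nm")  # 99.6 nm
```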


Combining Path Length Difference with Phase Change
Thin-film interference is most constructive or most destructive when the path length difference for the two rays is an integral or half-integral wavelength. That is, for rays incident perpendicularly,


2t = λn, 2λn, 3λn ,… or 2t = λn/2, 3λn/2, 5λn/2,….


To know whether interference is constructive or destructive, you must also determine if there is a phase change upon reflection. Thin-film interference thus depends on film thickness, the wavelength of light, and the refractive indices. For white light incident on a film that varies in thickness, you can observe rainbow colors of constructive interference for various wavelengths as the thickness varies.
Example 3.4


Soap Bubbles
(a) What are the three smallest thicknesses of a soap bubble that produce constructive interference for red light with a wavelength of 650 nm? The index of refraction of soap is taken to be the same as that of water. (b) What three smallest thicknesses give destructive interference?
Strategy
Use Figure 3.12 to visualize the bubble, which acts as a thin film between two layers of air. Thus
n1 = n3 = 1.00 for air, and n2 = 1.333 for soap (equivalent to water). There is a λ/2 shift for ray 1 reflected
from the top surface of the bubble and no shift for ray 2 reflected from the bottom surface. To get constructive interference, then, the path length difference (2t) must be a half-integral multiple of the wavelength, the first three being λn/2, 3λn/2, and 5λn/2. To get destructive interference, the path length difference must be an integral multiple of the wavelength, the first three being 0, λn, and 2λn.
Solution
a. Constructive interference occurs here when


$$2t_c = \frac{\lambda_n}{2}, \frac{3\lambda_n}{2}, \frac{5\lambda_n}{2}, \ldots.$$
Thus, the smallest constructive thickness tc is
$$t_c = \frac{\lambda_n}{4} = \frac{\lambda/n}{4} = \frac{(650\ \text{nm})/1.333}{4} = 122\ \text{nm}.$$
The next thickness that gives constructive interference is tc′ = 3λn/4, so that
$$t_c' = 366\ \text{nm}.$$
Finally, the third thickness producing constructive interference is tc″ = 5λn/4, so that
$$t_c'' = 610\ \text{nm}.$$


b. For destructive interference, the path length difference here is an integral multiple of the wavelength. The first occurs for zero thickness, since there is a phase change at the top surface, that is,
$$t_d = 0,$$
the very thin (or negligibly thin) case discussed above. The first non-zero thickness producing destructive interference is
$$2t_d' = \lambda_n.$$
Substituting known values gives
$$t_d' = \frac{\lambda_n}{2} = \frac{\lambda/n}{2} = \frac{(650\ \text{nm})/1.333}{2} = 244\ \text{nm}.$$
Finally, the third destructive thickness is 2td″ = 2λn, so that
$$t_d'' = \lambda_n = \frac{\lambda}{n} = \frac{650\ \text{nm}}{1.333} = 488\ \text{nm}.$$


Significance
If the bubble were illuminated with pure red light, we would see bright and dark bands at very uniform increases in thickness. First would be a dark band at 0 thickness, then bright at 122 nm thickness, then dark at 244 nm, bright at 366 nm, dark at 488 nm, and bright at 610 nm. If the bubble varied smoothly in thickness, like a smooth wedge, then the bands would be evenly spaced.
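The bookkeeping in this example (path difference 2t combined with a possible λ/2 shift at each surface) can be captured in a short Python sketch. It assumes normal incidence and the reflection rule stated earlier, namely a half-wavelength shift only when light reflects from a medium of higher index. The function names and the returned data layout are illustrative assumptions.

```python
def reflection_shift(n_from, n_to):
    """Phase shift on reflection, in wavelengths: 0.5 if the light reflects
    from a medium of higher index, otherwise 0.0."""
    return 0.5 if n_to > n_from else 0.0

def film_thicknesses(wavelength_vacuum, n1, n2, n3, count=3):
    """Return (constructive, destructive) lists holding the `count` smallest
    film thicknesses for reflected light at normal incidence.

    n1, n2, n3 are the indices above the film, of the film, and below it.
    """
    lam_n = wavelength_vacuum / n2  # wavelength inside the film
    # True when exactly one of the two reflections is shifted by lambda/2.
    net_half_shift = reflection_shift(n1, n2) != reflection_shift(n2, n3)
    half_integral = [(m + 0.5) * lam_n / 2 for m in range(count)]  # 2t = (m + 1/2) lam_n
    integral = [m * lam_n / 2 for m in range(count)]               # 2t = m lam_n
    if net_half_shift:
        return half_integral, integral
    return integral, half_integral

# Soap bubble of Example 3.4: air / water-like film / air, 650-nm light.
constructive, destructive = film_thicknesses(650e-9, 1.00, 1.333, 1.00)
print([round(t * 1e9) for t in constructive])  # [122, 366, 610]
print([round(t * 1e9) for t in destructive])   # [0, 244, 488]
```

The same function applied to air, a film with n = 1.38, and glass with n = 1.52 finds no net shift between the two reflections, so its smallest destructive thickness is a quarter of the wavelength in the film.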


Check Your Understanding 3.2 Going further with Example 3.4, what are the next two thicknesses of soap bubble that would lead to (a) constructive interference, and (b) destructive interference?


Another example of thin-film interference can be seen when microscope slides are separated (see Figure 3.14). The slides are very flat, so that the wedge of air between them increases in thickness very uniformly. A phase change occurs at the second surface but not the first, so a dark band forms where the slides touch. The rainbow colors of constructive interference repeat, going from violet to red again and again as the distance between the slides increases. As the layer of air increases, the bands become more difficult to see, because slight changes in incident angle have greater effects on path length differences. If monochromatic light instead of white light is used, then bright and dark bands are obtained rather than repeating rainbow colors.






Figure 3.14 (a) The rainbow-color bands are produced by thin-film interference in the air between the two glass slides. (b) Schematic of the paths taken by rays in the wedge of air between the slides. (c) If the air wedge is illuminated with monochromatic light, bright and dark bands are obtained rather than repeating rainbow colors.


An important application of thin-film interference is found in the manufacturing of optical instruments. A lens or mirror can be compared with a master as it is being ground, allowing it to be shaped to an accuracy of less than a wavelength over its entire surface. Figure 3.15 illustrates the phenomenon called Newton’s rings, which occurs when the plane surfaces of two lenses are placed together. (The circular bands are called Newton’s rings because Isaac Newton described them and their use in detail. Newton did not discover them; Robert Hooke did, and Newton did not believe they were due to the wave character of light.) Each successive ring of a given color indicates an increase of only half a wavelength in the distance between the lens and the blank, so that great precision can be obtained. Once the lens is perfect, no rings appear.






Figure 3.15 “Newton’s rings” interference fringes are produced when two plano-convex lenses are placed together with their plane surfaces in contact. The rings are created by interference between the light reflected off the two surfaces as a result of a slight gap between them, indicating that these surfaces are not precisely plane but are slightly convex. (credit: Ulf Seifert)


Thin-film interference has many other applications, both in nature and in manufacturing. The wings of certain moths and butterflies have nearly iridescent colors due to thin-film interference. In addition to pigmentation, the wing’s color is affected greatly by constructive interference of certain wavelengths reflected from its film-coated surface. Some car manufacturers offer special paint jobs that use thin-film interference to produce colors that change with angle. This expensive option is based on variation of thin-film path length differences with angle. Security features on credit cards, banknotes, driving licenses, and similar items prone to forgery use thin-film interference, diffraction gratings, or holograms. As early as 1998, Australia led the way with dollar bills printed on polymer with a diffraction grating security feature, making the currency difficult to forge. Other countries, such as Canada, New Zealand, and Taiwan, are using similar technologies, while US currency includes a thin-film interference effect.
3.5 | The Michelson Interferometer


Learning Objectives
By the end of this section, you will be able to:
• Explain changes in fringes observed with a Michelson interferometer caused by mirror movements
• Explain changes in fringes observed with a Michelson interferometer caused by changes in medium


The Michelson interferometer (invented by the American physicist Albert A. Michelson, 1852–1931) is a precision instrument that produces interference fringes by splitting a light beam into two parts and then recombining them after they have traveled different optical paths. Figure 3.16 depicts the interferometer and the path of a light beam from a single point on the extended source S, which is a ground-glass plate that diffuses the light from a monochromatic lamp of wavelength λ0. The beam strikes the half-silvered mirror M, where half of it is reflected to the side and half passes through the mirror. The reflected light travels to the movable plane mirror M1, where it is reflected back through M to the observer. The transmitted half of the original beam is reflected back by the stationary mirror M2 and then toward the observer by M.






Figure 3.16 (a) The Michelson interferometer. The extended light source is a ground-glass plate that diffuses the light from a laser. (b) A planar view of the interferometer.


Because both beams originate from the same point on the source, they are coherent and therefore interfere. Notice from the figure that one beam passes through M three times and the other only once. To ensure that both beams traverse the same thickness of glass, a compensator plate C of transparent glass is placed in the arm containing M2. This plate is a duplicate of M (without the silvering) and is usually cut from the same piece of glass used to produce M. With the compensator in place, any phase difference between the two beams is due solely to the difference in the distances they travel.
The path difference of the two beams when they recombine is 2d1 − 2d2, where d1 is the distance between M and M1, and d2 is the distance between M and M2. Suppose this path difference is an integer number of wavelengths mλ0. Then, constructive interference occurs and a bright image of the point on the source is seen at the observer. Now the light from any other point on the source whose two beams have this same path difference also undergoes constructive interference and produces a bright image. The collection of these point images is a bright fringe corresponding to a path difference of mλ0 (Figure 3.17). When M1 is moved a distance Δd = λ0/2, this path difference changes by λ0, and each fringe moves to the position previously occupied by an adjacent fringe. Consequently, by counting the number of fringes m passing a given point as M1 is moved, an observer can measure minute displacements that are accurate to a fraction of a wavelength, as shown by the relation


$$\Delta d = m\frac{\lambda_0}{2}. \tag{3.7}$$


Figure 3.17 Fringes produced with a Michelson interferometer. (credit: “SILLAGESvideos”/YouTube)






Example 3.5
Precise Distance Measurements by Michelson Interferometer
A red laser light of wavelength 630 nm is used in a Michelson interferometer. While keeping the mirror M1
fixed, mirror M2 is moved. The fringes are found to move past a fixed cross-hair in the viewer. Find the distance
the mirror M2 is moved for a single fringe to move past the reference line.
Strategy
Refer to Figure 3.16 for the geometry. We use the result of the Michelson interferometer interference condition to find the distance moved, Δd .
Solution
For a 630-nm red laser light, and for each fringe crossing (m = 1), the distance traveled by M2 if you keep M1 fixed is
$$\Delta d = m\frac{\lambda_0}{2} = 1 \times \frac{630\ \text{nm}}{2} = 315\ \text{nm} = 0.315\ \mu\text{m}.$$


Significance
An important application of this measurement is the definition of the standard meter. As mentioned in Units and Measurement (http://cnx.org/content/m58268/latest/), the length of the standard meter was once defined as the mirror displacement in a Michelson interferometer corresponding to 1,650,763.73 wavelengths of the particular fringe of krypton-86 in a gas discharge tube.
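The relation used in Example 3.5 is easy to apply in either direction. The Python sketch below is only an illustration of Equation 3.7; the function names `mirror_displacement` and `fringe_count` are hypothetical, and any consistent length unit can be used for both arguments.

```python
def mirror_displacement(fringes, wavelength):
    """Mirror displacement for a given fringe count: delta_d = m * lambda_0 / 2."""
    return fringes * wavelength / 2.0


def fringe_count(displacement, wavelength):
    """Number of fringes passing a reference line: m = 2 * delta_d / lambda_0."""
    return 2.0 * displacement / wavelength


# Values from Example 3.5: one fringe, 630-nm light (lengths in nm).
delta_d_nm = mirror_displacement(1, 630.0)
print(delta_d_nm)                       # 315.0
print(fringe_count(delta_d_nm, 630.0))  # 1.0
```

The factor of 2 appears because the light crosses the changed arm length twice, once on the way to the mirror and once on the way back.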


Example 3.6
Measuring the Refractive Index of a Gas
In one arm of a Michelson interferometer, a glass chamber is placed with attachments for evacuating the inside and putting gases in it. The space inside the container is 2 cm wide. Initially, the container is empty. As gas is slowly let into the chamber, you observe that dark fringes move past a reference line in the field of observation. By the time the chamber is filled to the desired pressure, you have counted 122 fringes move past the reference line. The wavelength of the light used is 632.8 nm. What is the refractive index of this gas?


Strategy
The m = 122 fringes observed compose the difference between the number of wavelengths that fit within the empty chamber (vacuum) and the number of wavelengths that fit within the same chamber when it is gas-filled. The wavelength in the filled chamber is shorter by a factor of n, the index of refraction.




Solution
The ray travels a distance t = 2 cm to the right through the glass chamber and another distance t to the left upon reflection. The total travel is L = 2t . When empty, the number of wavelengths that fit in this chamber is
$$N_0 = \frac{L}{\lambda_0} = \frac{2t}{\lambda_0}$$
where λ0 = 632.8 nm is the wavelength in vacuum of the light used. In any other medium, the wavelength is λ = λ0 /n and the number of wavelengths that fit in the gas-filled chamber is
$$N = \frac{L}{\lambda} = \frac{2t}{\lambda_0/n}.$$
The number of fringes observed in the transition is
$$m = N - N_0 = \frac{2t}{\lambda_0/n} - \frac{2t}{\lambda_0} = \frac{2t}{\lambda_0}(n - 1).$$
Solving for (n − 1) gives
$$n - 1 = m\left(\frac{\lambda_0}{2t}\right) = 122\left(\frac{632.8 \times 10^{-9}\ \text{m}}{2(2 \times 10^{-2}\ \text{m})}\right) = 0.0019$$
and n = 1.0019 .
Significance
The indices of refraction for gases are so close to that of vacuum, that we normally consider them equal to 1. The difference between 1 and 1.0019 is so small that measuring it requires a correspondingly sensitive technique such as interferometry. We cannot, for example, hope to measure this value using techniques based simply on Snell’s law.
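The result of Example 3.6 can be checked numerically. The following Python sketch evaluates n = 1 + mλ0/(2t) with the values given in the example; the function name `index_from_fringes` and its parameter names are assumptions made for illustration.

```python
def index_from_fringes(fringes, wavelength_vacuum_m, chamber_width_m):
    """Refractive index of the gas from the fringe count.

    Uses m = (2t / lambda_0) * (n - 1), solved for n, where the light
    crosses the chamber of width t twice.
    """
    return 1.0 + fringes * wavelength_vacuum_m / (2.0 * chamber_width_m)


# Values from Example 3.6: 122 fringes, 632.8-nm light, 2-cm chamber.
n_gas = index_from_fringes(122, 632.8e-9, 2e-2)
print(f"{n_gas:.4f}")  # 1.0019
```

Because the fringe count enters linearly, any miscount in m shifts the computed value of n − 1 in direct proportion.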


3.3 Check Your Understanding Although m, the number of fringes observed, is an integer, which is often regarded as having zero uncertainty, in practical terms, it is all too easy to lose track when counting fringes. In Example 3.6, if you estimate that you might have missed as many as five fringes when you reported m = 122 fringes, (a) is the value for the index of refraction worked out in Example 3.6 too large or too small? (b) By how much?


Problem-Solving Strategy: Wave Optics
Step 1. Examine the situation to determine that interference is involved. Identify whether slits, thin films, or interferometers are considered in the problem.
Step 2. If slits are involved, note that diffraction gratings and double slits produce very similar interference patterns, but that gratings have narrower (sharper) maxima. Single-slit patterns are characterized by a large central maximum and smaller maxima to the sides.
Step 3. If thin-film interference or an interferometer is involved, take note of the path length difference between the two rays that interfere. Be certain to use the wavelength in the medium involved, since it differs from the wavelength in vacuum. Note also that there is an additional λ/2 phase shift when light reflects from a medium with a greater index of refraction.
Step 4. Identify exactly what needs to be determined in the problem (identify the unknowns). A written list is useful. Draw a diagram of the situation. Labeling the diagram is useful.
Step 5. Make a list of what is given or can be inferred from the problem as stated (identify the knowns).
Step 6. Solve the appropriate equation for the quantity to be determined (the unknown) and enter the knowns. Slits, gratings, and the Rayleigh limit involve equations.
Step 7. For thin-film interference, you have constructive interference for a total shift that is an integral number of wavelengths. You have destructive interference for a total shift of a half-integral number of wavelengths. Always keep in mind that crest to crest is constructive whereas crest to trough is destructive.
Step 8. Check to see if the answer is reasonable: Does it make sense? Angles in interference patterns cannot be greater than 90° , for example.
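The reasonableness check in Step 8 can be built directly into a calculation. The Python sketch below solves d sin θ = mλ for a double slit and reports when no angle exists because sin θ would exceed 1. The function name `bright_fringe_angle_deg` and the sample values are illustrative assumptions, not taken from the chapter's problems.

```python
import math


def bright_fringe_angle_deg(order, wavelength_m, slit_separation_m):
    """Angle (degrees) of the m-th double-slit maximum, or None if it cannot exist.

    Solves d * sin(theta) = m * lambda. If |m * lambda / d| > 1 there is
    no real angle, which signals an unreasonable set of inputs.
    """
    sine = order * wavelength_m / slit_separation_m
    if abs(sine) > 1.0:
        return None
    return math.degrees(math.asin(sine))


# Reasonable case: 600-nm light, slits 2.0 micrometers apart, first order.
print(bright_fringe_angle_deg(1, 600e-9, 2.0e-6))  # about 17.46 degrees

# Unreasonable case: the wavelength exceeds the slit separation.
print(bright_fringe_angle_deg(1, 600e-9, 500e-9))  # None
```

Returning `None` rather than raising an error is a design choice for this sketch; it keeps the check visible in the output so that an impossible configuration is noticed instead of silently producing a number.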






CHAPTER 3 REVIEW
KEY TERMS
coherent waves: waves are in phase or have a definite phase relationship
fringes: bright and dark patterns of interference
incoherent: waves have random phase relationships
interferometer: instrument that uses interference of waves to make measurements
monochromatic: light composed of one wavelength only
Newton’s rings: circular interference pattern created by interference between the light reflected off two surfaces as a result of a slight gap between them
order: integer m used in the equations for constructive and destructive interference for a double slit
principal maximum: brightest interference fringes seen with multiple slits
secondary maximum: bright interference fringes of intensity lower than the principal maxima
thin-film interference: interference between light reflected from different surfaces of a thin film


KEY EQUATIONS
Constructive interference: $\Delta l = m\lambda$, for $m = 0, \pm 1, \pm 2, \pm 3, \ldots$
Destructive interference: $\Delta l = \left(m + \frac{1}{2}\right)\lambda$, for $m = 0, \pm 1, \pm 2, \pm 3, \ldots$
Path length difference for waves from two slits to a common point on a screen: $\Delta l = d \sin\theta$
Constructive interference: $d \sin\theta = m\lambda$, for $m = 0, \pm 1, \pm 2, \pm 3, \ldots$
Destructive interference: $d \sin\theta = \left(m + \frac{1}{2}\right)\lambda$, for $m = 0, \pm 1, \pm 2, \pm 3, \ldots$
Distance from central maximum to the mth bright fringe: $y_m = \frac{m\lambda D}{d}$
Displacement measured by a Michelson interferometer: $\Delta d = m\frac{\lambda_0}{2}$
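The first two key equations amount to a test on the ratio of path length difference to wavelength. As a rough illustration, the Python sketch below classifies a path difference as constructive, destructive, or in between. The function name `classify_interference`, the tolerance `tol`, and the sample values are assumptions for this sketch, and it presumes the two sources are in phase with no extra reflection phase shifts.

```python
def classify_interference(path_difference, wavelength, tol=1e-6):
    """Classify interference from the path length difference.

    Constructive if delta_l = m * lambda, destructive if
    delta_l = (m + 1/2) * lambda, for integer m; otherwise intermediate.
    Both arguments must use the same length unit.
    """
    ratio = path_difference / wavelength
    if abs(ratio - round(ratio)) < tol:
        return "constructive"
    nearest_half = round(ratio - 0.5) + 0.5  # closest half-integer
    if abs(ratio - nearest_half) < tol:
        return "destructive"
    return "intermediate"


print(classify_interference(1300.0, 650.0))  # constructive (2 wavelengths)
print(classify_interference(975.0, 650.0))   # destructive (1.5 wavelengths)
print(classify_interference(800.0, 650.0))   # intermediate
```

A tolerance is used because measured or computed path differences are rarely exact multiples of the wavelength in floating-point arithmetic.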


SUMMARY
3.1 Young's Double-Slit Interference


• Young’s double-slit experiment gave definitive proof of the wave character of light.
• An interference pattern is obtained by the superposition of light from two slits.


3.2 Mathematics of Interference
• In double-slit diffraction, constructive interference occurs when d sin θ = mλ (for m = 0, ±1, ±2, ±3, …), where d is the distance between the slits, θ is the angle relative to the incident direction, and m is the order of the interference.
• Destructive interference occurs when d sin θ = (m + 1/2)λ for m = 0, ±1, ±2, ±3, … .






3.3 Multiple-Slit Interference
• Interference from multiple slits ( N > 2 ) produces principal as well as secondary maxima.
• As the number of slits is increased, the intensity of the principal maxima increases and the width decreases.


3.4 Interference in Thin Films
• When light reflects from a medium having an index of refraction greater than that of the medium in which it is traveling, a 180° phase change (or a λ/2 shift) occurs.
• Thin-film interference occurs between the light reflected from the top and bottom surfaces of a film. In addition to the path length difference, there can be a phase change.


3.5 The Michelson Interferometer
• When the mirror in one arm of the interferometer moves a distance of λ/2, each fringe in the interference pattern moves to the position previously occupied by the adjacent fringe.


CONCEPTUAL QUESTIONS
3.1 Young's Double-Slit Interference
1. Young’s double-slit experiment breaks a single light beam into two sources. Would the same pattern be obtained for two independent sources of light, such as the headlights of a distant car? Explain.
2. Is it possible to create an experimental setup in which there is only destructive interference? Explain.
3. Why won’t two small sodium lamps, held close together, produce an interference pattern on a distant screen? What if the sodium lamps were replaced by two laser pointers held close together?


3.2 Mathematics of Interference
4. Suppose you use the same double slit to perform Young’s double-slit experiment in air and then repeat the experiment in water. Do the angles to the same parts of the interference pattern get larger or smaller? Does the color of the light change? Explain.
5. Why is monochromatic light used in the double slit experiment? What would happen if white light were used?


3.4 Interference in Thin Films
6. What effect does increasing the wedge angle have on the spacing of interference fringes? If the wedge angle is too large, fringes are not observed. Why?
7. How is the difference in paths taken by two originally in-phase light waves related to whether they interfere constructively or destructively? How can this be affected by reflection? By refraction?
8. Is there a phase change in the light reflected from either surface of a contact lens floating on a person’s tear layer? The index of refraction of the lens is about 1.5, and its top surface is dry.
9. In placing a sample on a microscope slide, a glass cover is placed over a water drop on the glass slide. Light incident from above can reflect from the top and bottom of the glass cover and from the glass slide below the water drop. At which surfaces will there be a phase change in the reflected light?
10. Answer the above question if the fluid between the two pieces of crown glass is carbon disulfide.
11. While contemplating the food value of a slice of ham, you notice a rainbow of color reflected from its moist surface. Explain its origin.
12. An inventor notices that a soap bubble is dark at its thinnest and realizes that destructive interference is taking place for all wavelengths. How could she use this knowledge to make a nonreflective coating for lenses that is effective at all wavelengths? That is, what limits would there be on the index of refraction and thickness of the coating? How might this be impractical?
13. A nonreflective coating like the one described in Example 3.3 works ideally for a single wavelength and for perpendicular incidence. What happens for other wavelengths and other incident directions? Be specific.
14. Why is it much more difficult to see interference fringes for light reflected from a thick piece of glass than from a thin film? Would it be easier if monochromatic light were used?


3.5 The Michelson Interferometer
15. Describe how a Michelson interferometer can be used to measure the index of refraction of a gas (including air).


PROBLEMS
3.2 Mathematics of Interference
16. At what angle is the first-order maximum for 450-nm wavelength blue light falling on double slits separated by 0.0500 mm?
17. Calculate the angle for the third-order maximum of 580-nm wavelength yellow light falling on double slits separated by 0.100 mm.
18. What is the separation between two slits for which 610-nm orange light has its first maximum at an angle of 30.0°?
19. Find the distance between two slits that produces the first minimum for 410-nm violet light at an angle of 45.0°.


20. Calculate the wavelength of light that has its third minimum at an angle of 30.0° when falling on double slits separated by 3.00 µm. Explicitly show how you follow the steps from the Problem-Solving Strategy: Wave Optics, located at the end of the chapter.
21. What is the wavelength of light falling on double slits separated by 2.00 µm if the third-order maximum is at an angle of 60.0°?
22. At what angle is the fourth-order maximum for the situation in the preceding problem?
23. What is the highest-order maximum for 400-nm light falling on double slits separated by 25.0 µm?
24. Find the largest wavelength of light falling on double slits separated by 1.20 µm for which there is a first-order maximum. Is this in the visible part of the spectrum?
25. What is the smallest separation between two slits that will produce a second-order maximum for 720-nm red light?
26. (a) What is the smallest separation between two slits that will produce a second-order maximum for any visible light? (b) For all visible light?


27. (a) If the first-order maximum for monochromatic light falling on a double slit is at an angle of 10.0°, at what angle is the second-order maximum? (b) What is the angle of the first minimum? (c) What is the highest-order maximum possible here?
28. Shown below is a double slit located a distance x from a screen, with the distance from the center of the screen given by y. When the distance d between the slits is relatively large, numerous bright spots appear, called fringes. Show that, for small angles (where sin θ ≈ θ, with θ in radians), the distance between fringes is given by Δy = xλ/d.


29. Using the result of the preceding problem, (a) calculate the distance between fringes for 633-nm light falling on double slits separated by 0.0800 mm, located 3.00 m from a screen. (b) What would be the distance between fringes if the entire apparatus were submersed in water, whose index of refraction is 1.33?
30. Using the result of the problem two problems prior, find the wavelength of light that produces fringes 7.50 mm apart on a screen 2.00 m from double slits separated by 0.120 mm.
31. In a double-slit experiment, the fifth maximum is 2.8 cm from the central maximum on a screen that is 1.5 m away from the slits. If the slits are 0.15 mm apart, what is the wavelength of the light being used?
32. The source in Young’s experiment emits at two wavelengths. On the viewing screen, the fourth maximum for one wavelength is located at the same spot as the fifth maximum for the other wavelength. What is the ratio of the two wavelengths?
33. If 500-nm and 650-nm light illuminates two slits that are separated by 0.50 mm, how far apart are the second-order maxima for these two wavelengths on a screen 2.0 m away?
34. Red light of wavelength 700 nm falls on a double slit separated by 400 nm. (a) At what angle is the first-order maximum in the diffraction pattern? (b) What is unreasonable about this result? (c) Which assumptions are unreasonable or inconsistent?


3.3 Multiple-Slit Interference
35. Ten narrow slits are equally spaced 0.25 mm apart and illuminated with yellow light of wavelength 580 nm. (a) What are the angular positions of the third and fourth principal maxima? (b) What is the separation of these maxima on a screen 2.0 m from the slits?
36. The width of bright fringes can be calculated as the separation between the two adjacent dark fringes on either side. Find the angular widths of the third- and fourth-order bright fringes from the preceding problem.
37. For a three-slit interference pattern, find the ratio of the peak intensities of a secondary maximum to a principal maximum.
38. What is the angular width of the central fringe of the interference pattern of (a) 20 slits separated by d = 2.0 × 10⁻³ mm? (b) 50 slits with the same separation? Assume that λ = 600 nm.


3.4 Interference in Thin Films
39. A soap bubble is 100 nm thick and illuminated by white light incident perpendicular to its surface. What wavelength and color of visible light is most constructively reflected, assuming the same index of refraction as water?
40. An oil slick on water is 120 nm thick and illuminated by white light incident perpendicular to its surface. What color does the oil appear (what is the most constructively reflected wavelength), given its index of refraction is 1.40?
41. Calculate the minimum thickness of an oil slick on water that appears blue when illuminated by white light perpendicular to its surface. Take the blue wavelength to be 470 nm and the index of refraction of oil to be 1.40.
42. Find the minimum thickness of a soap bubble that appears red when illuminated by white light perpendicular to its surface. Take the wavelength to be 680 nm, and assume the same index of refraction as water.
43. A film of soapy water (n = 1.33) on top of a plastic cutting board has a thickness of 233 nm. What color is most strongly reflected if it is illuminated perpendicular to its surface?
44. What are the three smallest non-zero thicknesses of soapy water (n = 1.33) on Plexiglas if it appears green (constructively reflecting 520-nm light) when illuminated perpendicularly by white light?
45. Suppose you have a lens system that is to be used primarily for 700-nm red light. What is the second thinnest coating of fluorite (magnesium fluoride) that would be nonreflective for this wavelength?
46. (a) As a soap bubble thins it becomes dark, because the path length difference becomes small compared with the wavelength of light and there is a phase shift at the top surface. If it becomes dark when the path length difference is less than one-fourth the wavelength, what is the thickest the bubble can be and appear dark at all visible wavelengths? Assume the same index of refraction as water. (b) Discuss the fragility of the film considering the thickness found.
47. To save money on making military aircraft invisible to radar, an inventor decides to coat them with a nonreflective material having an index of refraction of 1.20, which is between that of air and the surface of the plane. This, he reasons, should be much cheaper than designing Stealth bombers. (a) What thickness should the coating be to inhibit the reflection of 4.00-cm wavelength radar? (b) What is unreasonable about this result? (c) Which assumptions are unreasonable or inconsistent?


3.5 The Michelson Interferometer
48. A Michelson interferometer has two equal arms. A mercury light of wavelength 546 nm is used for the interferometer and stable fringes are found. One of the arms is moved by 1.5 µm. How many fringes will cross the observing field?
49. What is the distance moved by the traveling mirror of a Michelson interferometer that corresponds to 1500 fringes passing by a point of the observation screen? Assume that the interferometer is illuminated with a 606 nm spectral line of krypton-86.
50. When the traveling mirror of a Michelson interferometer is moved 2.40 × 10⁻⁵ m, 90 fringes pass by a point on the observation screen. What is the wavelength of the light used?
51. In a Michelson interferometer, light of wavelength 632.8 nm from a He-Ne laser is used. When one of the mirrors is moved by a distance D, 8 fringes move past the field of view. What is the value of the distance D?
52. A chamber 5.0 cm long with flat, parallel windows at the ends is placed in one arm of a Michelson interferometer (see below). The light used has a wavelength of 500 nm in a vacuum. While all the air is being pumped out of the chamber, 29 fringes pass by a point on the observation screen. What is the refractive index of the air?


ADDITIONAL PROBLEMS
53. For 600-nm wavelength light and a slit separation of 0.12 mm, what are the angular positions of the first and third maxima in the double slit interference pattern?
54. If the light source in the preceding problem is changed, the angular position of the third maximum is found to be 0.57°. What is the wavelength of light being used now?
55. Red light (λ = 710. nm) illuminates double slits separated by a distance d = 0.150 mm. The screen and the slits are 3.00 m apart. (a) Find the distance on the screen between the central maximum and the third maximum. (b) What is the distance between the second and the fourth maxima?
56. Two sources are in phase and emit waves with λ = 0.42 m. Determine whether constructive or destructive interference occurs at points whose distances from the two sources are (a) 0.84 and 0.42 m, (b) 0.21 and 0.42 m, (c) 1.26 and 0.42 m, (d) 1.87 and 1.45 m, (e) 0.63 and 0.84 m and (f) 1.47 and 1.26 m.
57. Two slits 4.0 × 10⁻⁶ m apart are illuminated by light of wavelength 600 nm. What is the highest order fringe in the interference pattern?
58. Suppose that the highest order fringe that can be observed is the eighth in a double-slit experiment where 550-nm wavelength light is used. What is the minimum separation of the slits?
59. The interference pattern of a He-Ne laser light (λ = 632.9 nm) passing through two slits 0.031 mm apart is projected on a screen 10.0 m away. Determine the distance between the adjacent bright fringes.
60. Young’s double-slit experiment is performed immersed in water (n = 1.333). The light source is a He-Ne laser, λ = 632.9 nm in vacuum. (a) What is the wavelength of this light in water? (b) What is the angle for the third-order maximum for two slits separated by 0.100 mm?
61. A double-slit experiment is to be set up so that the bright fringes appear 1.27 cm apart on a screen 2.13 m away from the two slits. The light source has a wavelength of 500 nm. What should be the separation between the two slits?
62. An effect analogous to two-slit interference can occur with sound waves, instead of light. In an open field, two speakers placed 1.30 m apart are powered by a single-function generator producing sine waves at 1200-Hz frequency. A student walks along a line 12.5 m away and parallel to the line between the speakers. She hears an alternating pattern of loud and quiet, due to constructive and destructive interference. What is (a) the wavelength of this sound and (b) the distance between the central maximum and the first maximum (loud) position along this line?
63. A hydrogen gas discharge lamp emits visible light at four wavelengths, λ = 410, 434, 486, and 656 nm. (a) If light from this lamp falls on N slits separated by 0.025 mm, how far from the central maximum are the third maxima when viewed on a screen 2.0 m from the slits? (b) By what distance are the second and third maxima separated for λ = 486 nm?
64. Monochromatic light of frequency 5.5 × 10¹⁴ Hz falls on 10 slits separated by 0.020 mm. What is the separation between the first and third maxima on a screen that is 2.0 m from the slits?
65. Eight slits equally separated by 0.149 mm are uniformly illuminated by a monochromatic light at λ = 523 nm. What is the width of the central principal maximum on a screen 2.35 m away?
66. Eight slits equally separated by 0.149 mm are uniformly illuminated by a monochromatic light at λ = 523 nm. What is the intensity of a secondary maximum compared to that of the principal maximum?
67. A transparent film of thickness 250 nm and index of refraction of 1.40 is surrounded by air. What wavelength in a beam of white light at near-normal incidence to the film undergoes destructive interference when reflected?
68. An intensity minimum is found for 450 nm light transmitted through a transparent film (n = 1.20) in air. (a) What is the minimum thickness of the film? (b) If this wavelength is the longest for which the intensity minimum occurs, what are the next three lower values of λ for which this happens?
69. A thin film with n = 1.32 is surrounded by air. What is the minimum thickness of this film such that the reflection of normally incident light with λ = 500 nm is minimized?
70. Repeat your calculation of the previous problem with the thin film placed on a flat glass (n = 1.50) surface.
71. After a minor oil spill, a thin film of oil (n = 1.40) of thickness 450 nm floats on the water surface in a bay. (a) What predominant color is seen by a bird flying overhead? (b) What predominant color is seen by a seal swimming underwater?
72. A microscope slide 10 cm long is separated from a glass plate at one end by a sheet of paper. As shown below, the other end of the slide is in contact with the plate. The slide is illuminated from above by light from a sodium lamp (λ = 589 nm), and 14 fringes per centimeter are seen along the slide. What is the thickness of the piece of paper?


73. Suppose that the setup of the preceding problem is immersed in an unknown liquid. If 18 fringes per centimeter are now seen along the slide, what is the index of refraction of the liquid?


74. A thin wedge filled with air is produced when two flat glass plates are placed on top of one another and a slip of paper is inserted between them at one edge. Interference fringes are observed when monochromatic light falling vertically on the plates is seen in reflection. Is the first fringe near the edge where the plates are in contact a bright fringe or a dark fringe? Explain.
75. Two identical pieces of rectangular plate glass are used to measure the thickness of a hair. The glass plates are in direct contact at one edge and a single hair is placed between them near the opposite edge. When illuminated with a sodium lamp (λ = 589 nm), the hair is seen between the 180th and 181st dark fringes. What are the lower and upper limits on the hair’s diameter?
76. Two microscope slides made of glass are illuminated by monochromatic (λ = 589 nm) light incident perpendicularly. The top slide touches the bottom slide at one end and rests on a thin copper wire at the other end, forming a wedge of air. The diameter of the copper wire is 29.45 µm. How many bright fringes are seen across these slides?
77. A good quality camera “lens” is actually a system of lenses, rather than a single lens, but a side effect is that a reflection from the surface of one lens can bounce around many times within the system, creating artifacts in the photograph. To counteract this problem, one of the lenses in such a system is coated with a thin layer of material (n = 1.28) on one side. The index of refraction of the lens glass is 1.68. What is the smallest thickness of the coating that reduces the reflection at 640 nm by destructive interference? (In other words, the coating’s effect is to be optimized for λ = 640 nm.)
78. Constructive interference is observed from directly above an oil slick for wavelengths (in air) 440 nm and 616 nm. The index of refraction of this oil is n = 1.54. What is the film’s minimum possible thickness?
79. A soap bubble is blown outdoors. What colors (indicate by wavelengths) of the reflected sunlight are seen enhanced? The soap bubble has index of refraction 1.36 and thickness 380 nm.
80. A Michelson interferometer with a He-Ne laser light source (λ = 632.8 nm) projects its interference pattern on a screen. If the movable mirror is caused to move by 8.54 µm, how many fringes will be observed shifting through a reference point on a screen?
81. An experimenter detects 251 fringes when the movable mirror in a Michelson interferometer is displaced. The light source used is a sodium lamp, wavelength 589 nm. By what distance did the movable mirror move?
82. A Michelson interferometer is used to measure the wavelength of light put through it. When the movable mirror is moved by exactly 0.100 mm, the number of fringes observed moving through is 316. What is the wavelength of the light?
83. A 5.08-cm-long rectangular glass chamber is inserted into one arm of a Michelson interferometer using a 633-nm light source. This chamber is initially filled with air (n = 1.000293) at standard atmospheric pressure but the air is gradually pumped out using a vacuum pump until a near perfect vacuum is achieved. How many fringes are observed moving by during the transition?
84. Into one arm of a Michelson interferometer, a plastic sheet of thickness 75 µm is inserted, which causes a shift in the interference pattern by 86 fringes. The light source has wavelength of 610 nm in air. What is the index of refraction of this plastic?
85. The thickness of an aluminum foil is measured using a Michelson interferometer that has its movable mirror mounted on a micrometer. There is a difference of 27 fringes in the observed interference pattern when the micrometer clamps down on the foil compared to when the micrometer is empty. Calculate the thickness of the foil.
86. The movable mirror of a Michelson interferometer is attached to one end of a thin metal rod of length 23.3 mm. The other end of the rod is anchored so it does not move. As the temperature of the rod changes from 15 °C to 25 °C, a change of 14 fringes is observed. The light source is a He-Ne laser, λ = 632.8 nm. What is the change in length of the metal bar, and what is its thermal expansion coefficient?
87. In a thermally stabilized lab, a Michelson interferometer is used to monitor the temperature to ensure it stays constant. The movable mirror is mounted on the end of a 1.00-m-long aluminum rod, held fixed at the other end. The light source is a He-Ne laser, λ = 632.8 nm. The resolution of this apparatus corresponds to the temperature difference when a change of just one fringe is observed. What is this temperature difference?
88. A 65-fringe shift results in a Michelson interferometer when a 42.0-µm film made of an unknown material is placed in one arm. The light source has wavelength 632.9 nm. Identify the material using the indices of refraction found in Table 1.1.


CHALLENGE PROBLEMS
89. Determine what happens to the double-slit interference pattern if one of the slits is covered with a thin, transparent film whose thickness is λ/[2(n − 1)], where λ is the wavelength of the incident light and n is the index of refraction of the film.
90. Fifty-one narrow slits are equally spaced and separated by 0.10 mm. The slits are illuminated by blue light of wavelength 400 nm. What is the angular position of the twenty-fifth secondary maximum? What is its peak intensity in comparison with that of the primary maximum?
91. A film of oil on water will appear dark when it is very thin, because the path length difference becomes small compared with the wavelength of light and there is a phase shift at the top surface. If it becomes dark when the path length difference is less than one-fourth the wavelength, what is the thickest the oil can be and appear dark at all visible wavelengths? Oil has an index of refraction of 1.40.
92. Figure 3.14 shows two glass slides illuminated by monochromatic light incident perpendicularly. The top slide touches the bottom slide at one end and rests on a 0.100-mm-diameter hair at the other end, forming a wedge of air. (a) How far apart are the dark bands, if the slides are 7.50 cm long and 589-nm light is used? (b) Is there any difference if the slides are made from crown or flint glass? Explain.
93. Figure 3.14 shows two 7.50-cm-long glass slides illuminated by pure 589-nm wavelength light incident perpendicularly. The top slide touches the bottom slide at one end and rests on some debris at the other end, forming a wedge of air. How thick is the debris, if the dark bands are 1.00 mm apart?
94. A soap bubble is 100 nm thick and illuminated by white light incident at a 45° angle to its surface. What wavelength and color of visible light is most constructively reflected, assuming the same index of refraction as water?
95. An oil slick on water is 120 nm thick and illuminated by white light incident at a 45° angle to its surface. What color does the oil appear (what is the most constructively reflected wavelength), given its index of refraction is 1.40?






4 | DIFFRACTION


Figure 4.1 A steel ball bearing illuminated by a laser does not cast a sharp, circular shadow. Instead, a series of diffraction fringes and a central bright spot are observed. Known as Poisson’s spot, the effect was first predicted by Augustin-Jean Fresnel (1788–1827) as a consequence of diffraction of light waves. Based on principles of ray optics, Siméon-Denis Poisson (1781–1840) argued against Fresnel’s prediction. (credit: modification of work by Harvard Natural Science Lecture Demonstrations)


Chapter Outline
4.1 Single-Slit Diffraction
4.2 Intensity in Single-Slit Diffraction
4.3 Double-Slit Diffraction
4.4 Diffraction Gratings
4.5 Circular Apertures and Resolution
4.6 X-Ray Diffraction
4.7 Holography


Introduction
Imagine passing a monochromatic light beam through a narrow opening: a slit just a little wider than the wavelength of the light. Instead of a simple shadow of the slit on the screen, you will see that an interference pattern appears, even though there is only one slit.
In the chapter on interference, we saw that you need two sources of waves for interference to occur. How can there be an interference pattern when we have only one slit? In The Nature of Light, we learned that, due to Huygens’s principle, we can imagine a wave front as equivalent to infinitely many point sources of waves. Thus, a wave from a slit can behave not as one wave but as an infinite number of point sources. These waves can interfere with each other, resulting in an interference pattern without the presence of a second slit. This phenomenon is called diffraction.
Another way to view this is to recognize that a slit has a small but finite width. In the preceding chapter, we implicitly regarded slits as objects with positions but no size. The widths of the slits were considered negligible. When the slits have finite widths, each point along the opening can be considered a point source of light, a foundation of Huygens’s principle. Because real-world optical instruments must have finite apertures (otherwise, no light can enter), diffraction plays a major role in the way we interpret the output of these optical instruments. For example, diffraction places limits on our ability to resolve images or objects. This is a problem that we will study later in this chapter.
4.1 | Single-Slit Diffraction


Learning Objectives
By the end of this section, you will be able to:
• Explain the phenomenon of diffraction and the conditions under which it is observed
• Describe diffraction through a single slit


After passing through a narrow aperture (opening), a wave propagating in a specific direction tends to spread out. For example, sound waves that enter a room through an open door can be heard even if the listener is in a part of the room where the geometry of ray propagation dictates that there should only be silence. Similarly, ocean waves passing through an opening in a breakwater can spread throughout the bay inside (Figure 4.2). The spreading and bending of sound and ocean waves are two examples of diffraction, which is the bending of a wave around the edges of an opening or an obstacle, a phenomenon exhibited by all types of waves.


Figure 4.2 Because of the diffraction of waves, ocean waves entering through an opening in a breakwater can spread throughout the bay. (credit: modification of map data from Google Earth)


The diffraction of sound waves is apparent to us because wavelengths in the audible region are approximately the same size as the objects they encounter, a condition that must be satisfied if diffraction effects are to be observed easily. Since the wavelengths of visible light range from approximately 390 to 770 nm, most objects do not diffract light significantly. However, situations do occur in which apertures are small enough that the diffraction of light is observable. For example, if you place your middle and index fingers close together and look through the opening at a light bulb, you can see a rather clear diffraction pattern, consisting of light and dark lines running parallel to your fingers.
Diffraction through a Single Slit
Light passing through a single slit forms a diffraction pattern somewhat different from those formed by double slits or diffraction gratings, which we discussed in the chapter on interference. Figure 4.3 shows a single-slit diffraction pattern. Note that the central maximum is larger than maxima on either side and that the intensity decreases rapidly on either side. In contrast, a diffraction grating (Diffraction Gratings) produces evenly spaced lines that dim slowly on either side of the center.






Figure 4.3 Single-slit diffraction pattern. (a) Monochromatic light passing through a single slit has a central maximum and many smaller and dimmer maxima on either side. The central maximum is six times higher than shown. (b) The diagram shows the bright central maximum, and the dimmer and thinner maxima on either side.


The analysis of single-slit diffraction is illustrated in Figure 4.4. Here, the light arrives at the slit, illuminating it uniformly, and is in phase across its width. We then consider light propagating onwards from different parts of the same slit. According to Huygens’s principle, every part of the wave front in the slit emits wavelets, as we discussed in The Nature of Light. These are like rays that start out in phase and head in all directions. (Each ray is perpendicular to the wave front of a wavelet.) Assuming the screen is very far away compared with the size of the slit, rays heading toward a common destination are nearly parallel. When they travel straight ahead, as in part (a) of the figure, they remain in phase, and we observe a central maximum. However, when rays travel at an angle θ relative to the original direction of the beam, each ray travels a different distance to a common location, and they can arrive in or out of phase. In part (b), the ray from the bottom travels a distance of one wavelength λ farther than the ray from the top. Thus, a ray from the center travels a distance λ/2 less than the one at the bottom edge of the slit, arrives out of phase, and interferes destructively. A ray from slightly above the center and one from slightly above the bottom also cancel one another. In fact, each ray from the slit interferes destructively with another ray. In other words, a pair-wise cancellation of all rays results in a dark minimum in intensity at this angle. By symmetry, another minimum occurs at the same angle to the right of the incident direction of the light (toward the bottom of the figure).






Figure 4.4 Light passing through a single slit is diffracted in all directions and may interfere constructively or destructively, depending on the angle. The difference in path length for rays from either side of the slit is seen to be D sin θ.


At the larger angle shown in part (c), the path lengths differ by 3λ/2 for rays from the top and bottom of the slit. One ray travels a distance λ different from the ray from the bottom and arrives in phase, interfering constructively. Two rays, each from slightly above those two, also add constructively. Most rays from the slit have another ray to interfere with constructively, and a maximum in intensity occurs at this angle. However, not all rays interfere constructively for this situation, so the maximum is not as intense as the central maximum. Finally, in part (d), the angle shown is large enough to produce a second minimum. As seen in the figure, the difference in path length for rays from either side of the slit is D sin θ, and we see that a destructive minimum is obtained when this distance is an integral multiple of the wavelength.
Thus, to obtain destructive interference for a single slit,

$$D \sin\theta = m\lambda, \quad \text{for } m = \pm 1, \pm 2, \pm 3, \ldots \ \text{(destructive)}, \tag{4.1}$$


where D is the slit width, λ is the light’s wavelength, θ is the angle relative to the original direction of the light, and m is the order of the minimum. Figure 4.5 shows a graph of intensity for single-slit interference, and it is apparent that the maxima on either side of the central maximum are much less intense and not as wide. This effect is explored in Double-Slit Diffraction.






Figure 4.5 A graph of single-slit diffraction intensity showing the central maximum to be wider and much more intense than those to the sides. In fact, the central maximum is six times higher than shown here.


Example 4.1
Calculating Single-Slit Diffraction
Visible light of wavelength 550 nm falls on a single slit and produces its second diffraction minimum at an angle of 45.0° relative to the incident direction of the light, as in Figure 4.6. (a) What is the width of the slit? (b) At what angle is the first minimum produced?


Figure 4.6 In this example, we analyze a graph of the single-slit diffraction pattern.


Strategy
From the given information, and assuming the screen is far away from the slit, we can use the equation
D sin θ = mλ first to find D, and again to find the angle for the first minimum θ1.




Solution
a. We are given that λ = 550 nm, m = 2, and θ2 = 45.0°. Solving the equation D sin θ = mλ for D and substituting known values gives

$$D = \frac{m\lambda}{\sin\theta_2} = \frac{2(550\ \text{nm})}{\sin 45.0^\circ} = \frac{1100 \times 10^{-9}\ \text{m}}{0.707} = 1.56 \times 10^{-6}\ \text{m}.$$

b. Solving the equation D sin θ = mλ for sin θ1 and substituting the known values gives

$$\sin\theta_1 = \frac{m\lambda}{D} = \frac{1(550 \times 10^{-9}\ \text{m})}{1.56 \times 10^{-6}\ \text{m}}.$$

Thus the angle θ1 is

$$\theta_1 = \sin^{-1} 0.354 = 20.7^\circ.$$

Significance
We see that the slit is narrow (it is only a few times greater than the wavelength of light). This is consistent with the fact that light must interact with an object comparable in size to its wavelength in order to exhibit significant wave effects such as this single-slit diffraction pattern. We also see that the central maximum extends 20.7° on either side of the original beam, for a width of about 41°. The angle between the first and second minima is only about 24° (45.0° − 20.7°). Thus, the second maximum is only about half as wide as the central maximum.
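The arithmetic in Example 4.1 can be checked with a few lines of Python. This is only a minimal sketch: the helper names `slit_width` and `minimum_angle_deg` are illustrative choices, not part of the text, and the code simply applies D sin θ = mλ with the example's values.

```python
import math

def slit_width(wavelength, order, angle_deg):
    """Slit width D from D sin(theta) = m * lambda for a known minimum."""
    return order * wavelength / math.sin(math.radians(angle_deg))

def minimum_angle_deg(wavelength, width, order):
    """Angle of the m-th minimum in degrees, or None if m * lambda / D > 1."""
    s = order * wavelength / width
    if abs(s) > 1.0:
        return None  # no real angle satisfies the condition
    return math.degrees(math.asin(s))

wavelength = 550e-9                       # m, from Example 4.1
D = slit_width(wavelength, 2, 45.0)       # second minimum at 45.0 degrees
theta1 = minimum_angle_deg(wavelength, D, 1)

print(f"D = {D:.3e} m")
print(f"theta_1 = {theta1:.1f} deg")
print("third minimum exists:", minimum_angle_deg(wavelength, D, 3) is not None)
```

Returning `None` when mλ/D exceeds 1 reflects the fact that only a finite number of minima fit between −90° and +90° for a given slit width.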


Check Your Understanding 4.1 Suppose the slit width in Example 4.1 is increased to 1.8 × 10⁻⁶ m. What are the new angular positions for the first, second, and third minima? Would a fourth minimum exist?


4.2 | Intensity in Single-Slit Diffraction
Learning Objectives


By the end of this section, you will be able to:
• Calculate the intensity relative to the central maximum of the single-slit diffraction peaks
• Calculate the intensity relative to the central maximum of an arbitrary point on the screen


To calculate the intensity of the diffraction pattern, we follow the phasor method used for calculations with ac circuits in Alternating-Current Circuits (http://cnx.org/content/m58485/latest/). If we consider that there are N Huygens sources across the slit shown in Figure 4.4, with each source separated by a distance D/N from its adjacent neighbors, the path difference between waves from adjacent sources reaching the arbitrary point P on the screen is (D/N) sin θ. This distance is equivalent to a phase difference of (2πD/λN) sin θ. The phasor diagram for the waves arriving at the point whose angular position is θ is shown in Figure 4.7. The amplitude of the phasor for each Huygens wavelet is ΔE0, the amplitude of the resultant phasor is E, and the phase difference between the wavelets from the first and the last sources is

$$\phi = \left(\frac{2\pi}{\lambda}\right) D \sin\theta.$$


With N → ∞ , the phasor diagram approaches a circular arc of length NΔE0 and radius r. Since the length of the arc
is NΔE0 for any ϕ , the radius r of the arc must decrease as ϕ increases (or equivalently, as the phasors form tighter
spirals).






Figure 4.7 (a) Phasor diagram corresponding to the angular position θ in the single-slit diffraction pattern. The phase difference between the wavelets from the first and last sources is ϕ = (2π/λ)D sin θ. (b) The geometry of the phasor diagram.


The phasor diagram for ϕ = 0 (the center of the diffraction pattern) is shown in Figure 4.8(a) using N = 30. In this case, the phasors are laid end to end in a straight line of length NΔE0, the radius r goes to infinity, and the resultant has its maximum value E = NΔE0. The intensity of the light can be obtained using the relation I = (1/2)cε0E² from Electromagnetic Waves (http://cnx.org/content/m58495/latest/). The intensity of the maximum is then

$$I_0 = \frac{1}{2} c \varepsilon_0 (N \Delta E_0)^2 = \frac{1}{2\mu_0 c} (N \Delta E_0)^2,$$

where ε0 = 1/(µ0c²). The phasor diagrams for the first two zeros of the diffraction pattern are shown in parts (b) and (d) of the figure. In both cases, the phasors add to zero, after rotating through ϕ = 2π rad for m = 1 and 4π rad for m = 2.


Figure 4.8 Phasor diagrams (with 30 phasors) for various points on the single-slit diffraction pattern. Multiple rotations around a given circle have been separated slightly so that the phasors can be seen. (a) Central maximum, (b) first minimum, (c) first maximum beyond central maximum, (d) second minimum, and (e) second maximum beyond central maximum.


The next two maxima beyond the central maximum are represented by the phasor diagrams of parts (c) and (e). In part (c), the phasors have rotated through ϕ = 3π rad and have formed a resultant phasor of magnitude E1. The length of the arc formed by the phasors is NΔE0. Since this corresponds to 1.5 rotations around a circle of diameter E1, we have

$$\frac{3}{2} \pi E_1 = N \Delta E_0,$$

so

$$E_1 = \frac{2 N \Delta E_0}{3\pi}$$

and

$$I_1 = \frac{1}{2\mu_0 c} E_1^2 = \frac{4 (N \Delta E_0)^2}{(9\pi^2)(2\mu_0 c)} = 0.045 I_0,$$

where

$$I_0 = \frac{(N \Delta E_0)^2}{2\mu_0 c}.$$


In part (e), the phasors have rotated through ϕ = 5π rad, corresponding to 2.5 rotations around a circle of diameter E2 and arc length NΔE0. This results in I2 = 0.016I0. The proof is left as an exercise for the student (Exercise 4.119).
These two maxima actually correspond to values of ϕ slightly less than 3π rad and 5π rad. Since the total length of the arc of the phasor diagram is always NΔE0, the radius of the arc decreases as ϕ increases. As a result, E1 and E2 turn out to be slightly larger for arcs that have not quite curled through 3π rad and 5π rad, respectively. The exact values of ϕ for the maxima are investigated in Exercise 4.120. In solving that problem, you will find that they are less than, but very close to, ϕ = 3π, 5π, 7π, … rad.
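The statement that the maxima fall slightly below ϕ = 3π, 5π, … can be explored numerically. The sketch below assumes the intensity has the form (sin β/β)² with β = ϕ/2, as derived later in this section, so that setting the derivative to zero gives the condition tan β = β. The function names and the choice of bisection are illustrative assumptions, not the method of Exercise 4.120.

```python
import math

def f(beta):
    """Zero of this function <=> tan(beta) = beta (an extremum of sin(beta)/beta)."""
    return beta * math.cos(beta) - math.sin(beta)

def secondary_maximum(k, tol=1e-12):
    """Bisection for the k-th secondary maximum, bracketed by (k*pi, (k + 1/2)*pi)."""
    lo, hi = k * math.pi, (k + 0.5) * math.pi
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid            # sign change lies in the lower half
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

for k in (1, 2, 3):
    beta = secondary_maximum(k)
    phi = 2 * beta
    rel = (math.sin(beta) / beta) ** 2
    print(f"k={k}: phi = {phi / math.pi:.4f} pi rad, I/I0 = {rel:.4f}")
```

Under these assumptions the first two maxima land near β ≈ 1.43π and 2.46π (ϕ ≈ 2.86π and 4.92π), with relative intensities marginally larger than the 0.045 and 0.016 obtained at exactly 3π and 5π.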
To calculate the intensity at an arbitrary point P on the screen, we return to the phasor diagram of Figure 4.7. Since the arc subtends an angle ϕ at the center of the circle,

$$N \Delta E_0 = r\phi$$

and

$$\sin\left(\frac{\phi}{2}\right) = \frac{E}{2r},$$

where E is the amplitude of the resultant field. Solving the second equation for E and then substituting r from the first equation, we find

$$E = 2r \sin\frac{\phi}{2} = 2\,\frac{N \Delta E_0}{\phi} \sin\frac{\phi}{2}.$$


Now defining

$$\beta = \frac{\phi}{2} = \frac{\pi D \sin\theta}{\lambda} \tag{4.2}$$

we obtain

$$E = N \Delta E_0 \frac{\sin\beta}{\beta}. \tag{4.3}$$


This equation relates the amplitude of the resultant field at any point in the diffraction pattern to the amplitude NΔE0 at
the central maximum. The intensity is proportional to the square of the amplitude, so


$$I = I_0 \left(\frac{\sin\beta}{\beta}\right)^2 \tag{4.4}$$

where I0 = (NΔE0)²/(2µ0c) is the intensity at the center of the pattern.
For the central maximum, ϕ = 0, β is also zero and we see from l’Hôpital’s rule that lim(β → 0) (sin β/β) = 1, so that lim(ϕ → 0) I = I0. For the next maximum, ϕ = 3π rad, we have β = 3π/2 rad and when substituted into Equation 4.4, it yields

$$I_1 = I_0 \left(\frac{\sin 3\pi/2}{3\pi/2}\right)^2 = 0.045 I_0,$$


in agreement with what we found earlier in this section using the diameters and circumferences of phasor diagrams. Substituting ϕ = 5π rad into Equation 4.4 yields a similar result for I2.
A plot of Equation 4.4 is shown in Figure 4.9 and directly below it is a photograph of an actual diffraction pattern. Notice that the central peak is much brighter than the others, and that the zeros of the pattern are located at those points where sin β = 0 with β ≠ 0, which occurs when β = mπ rad for nonzero integer m. This corresponds to

$$\frac{\pi D \sin\theta}{\lambda} = m\pi,$$

or

$$D \sin\theta = m\lambda,$$

which is Equation 4.1.
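The phasor argument of this section can be tested directly by adding a large but finite number of equal phasors and comparing the result with the closed form sin β/β of Equation 4.3. The following is a rough sketch: the function names and the choice N = 1000 are assumptions made for illustration, and each phasor is advanced by the adjacent-source phase step ϕ/N.

```python
import cmath
import math

def phasor_sum_ratio(phi, n=1000):
    """|sum of n unit phasors| / n, with adjacent phase step phi / n.

    This is E / (N * dE0) for a finite number of Huygens sources.
    """
    step = phi / n
    total = sum(cmath.exp(1j * k * step) for k in range(n))
    return abs(total) / n

def closed_form_ratio(phi):
    """|sin(beta) / beta| with beta = phi / 2 (Equation 4.3 divided by N * dE0)."""
    beta = phi / 2
    return 1.0 if beta == 0 else abs(math.sin(beta) / beta)

for label, phi in [("first minimum", 2 * math.pi),
                   ("phi = 3 pi", 3 * math.pi),
                   ("second minimum", 4 * math.pi),
                   ("phi = 5 pi", 5 * math.pi)]:
    numeric = phasor_sum_ratio(phi) ** 2
    exact = closed_form_ratio(phi) ** 2
    print(f"{label:15s} I/I0: phasor sum = {numeric:.4f}, closed form = {exact:.4f}")
```

With N this large, the discrete sum and the closed form should agree to several decimal places, which is the numerical counterpart of letting N → ∞ in the phasor diagram.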


Figure 4.9 (a) The calculated intensity distribution of a single-slit diffraction pattern. (b) The actual diffraction pattern.




Example 4.2
Intensity in Single-Slit Diffraction
Light of wavelength 550 nm passes through a slit of width 2.00 µm and produces a diffraction pattern similar to that shown in Figure 4.9. (a) Find the locations of the first two minima in terms of the angle from the central maximum and (b) determine the intensity relative to the central maximum at a point halfway between these two minima.
Strategy
The minima are given by Equation 4.1, D sin θ = mλ. The first two minima are for m = 1 and m = 2. Equation 4.4 and Equation 4.2 can be used to determine the intensity once the angle has been worked out.
Solution


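a. Solving Equation 4.1 for θ gives us θm = sin⁻¹(mλ/D), so that

$$\theta_1 = \sin^{-1}\left(\frac{(+1)\left(550 \times 10^{-9}\ \text{m}\right)}{2.00 \times 10^{-6}\ \text{m}}\right) = +16.0^\circ$$

and

$$\theta_2 = \sin^{-1}\left(\frac{(+2)\left(550 \times 10^{-9}\ \text{m}\right)}{2.00 \times 10^{-6}\ \text{m}}\right) = +33.4^\circ.$$

b. The halfway point between θ1 and θ2 is

$$\theta = (\theta_1 + \theta_2)/2 = (16.0^\circ + 33.4^\circ)/2 = 24.7^\circ.$$

Equation 4.2 gives

$$\beta = \frac{\pi D \sin\theta}{\lambda} = \frac{\pi \left(2.00 \times 10^{-6}\ \text{m}\right) \sin(24.7^\circ)}{550 \times 10^{-9}\ \text{m}} = 1.52\pi \ \text{or} \ 4.77\ \text{rad}.$$

From Equation 4.4, we can calculate

$$\frac{I}{I_0} = \left(\frac{\sin\beta}{\beta}\right)^2 = \left(\frac{\sin(4.77)}{4.77}\right)^2 = \left(\frac{-0.9985}{4.77}\right)^2 = 0.044.$$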

Significance
This position, halfway between two minima, is very close to the location of the maximum, expected near
β = 3π/2, or 1.5π .
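The steps of Example 4.2 translate into a short calculation. This sketch only restates Equations 4.1, 4.2, and 4.4 with the example's numbers; the variable names are illustrative.

```python
import math

wavelength = 550e-9   # m
D = 2.00e-6           # slit width in m

# (a) first two minima from D sin(theta) = m * lambda
theta1 = math.asin(1 * wavelength / D)
theta2 = math.asin(2 * wavelength / D)

# (b) relative intensity halfway between them, using Equations 4.2 and 4.4
theta_mid = 0.5 * (theta1 + theta2)
beta = math.pi * D * math.sin(theta_mid) / wavelength
ratio = (math.sin(beta) / beta) ** 2

print(f"theta_1 = {math.degrees(theta1):.1f} deg")
print(f"theta_2 = {math.degrees(theta2):.1f} deg")
print(f"beta = {beta / math.pi:.2f} pi rad")
print(f"I/I0 = {ratio:.3f}")
```

Because the code carries unrounded intermediate values, its last digits can differ slightly from a hand calculation that rounds the angle to 24.7° first.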


Check Your Understanding 4.2 For the experiment in Example 4.2, at what angle from the center is the third maximum and what is its intensity relative to the central maximum?


If the slit width D is varied, the intensity distribution changes, as illustrated in Figure 4.10. The central peak is distributedover the region from sin θ = −λ/D to sin θ = + λ/D . For small θ , this corresponds to an angular width Δθ ≈ 2λ/D.
Hence, an increase in the slit width results in a decrease in the width of the central peak. For a slit with D ≫ λ, the
central peak is very sharp, whereas if D ≈ λ , it becomes quite broad.
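The dependence of the central peak's width on D/λ can be tabulated. The sketch below computes the exact full width 2 sin⁻¹(λ/D) and compares it with the small-angle estimate 2λ/D; the function name is a hypothetical helper, and the ratios 1, 5, and 10 are chosen to match the slit widths of Figure 4.10.

```python
import math

def central_peak_full_width_deg(width_over_wavelength):
    """Full angular width of the central peak, from sin(theta) = +/- lambda / D.

    Returns None when D < lambda, since no first minimum exists in that case.
    """
    ratio = 1.0 / width_over_wavelength   # lambda / D
    if ratio > 1.0:
        return None
    return 2.0 * math.degrees(math.asin(ratio))

for r in (1, 5, 10):
    exact = central_peak_full_width_deg(r)
    small_angle = math.degrees(2.0 / r)   # 2 * lambda / D, converted to degrees
    print(f"D = {r:2d} lambda: exact = {exact:6.2f} deg, small-angle = {small_angle:6.2f} deg")
```

The small-angle estimate is close for D = 5λ and 10λ but badly underestimates the width for D = λ, where the central peak fills the entire half-space in front of the slit.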






Figure 4.10 Single-slit diffraction patterns for various slit widths. As the slit width D increases from D = λ to 5λ and then to
10λ , the width of the central peak decreases as the angles for the first minima decrease as predicted by Equation 4.1.


A diffraction experiment in optics can require a lot of preparation but this simulation (https://openstaxcollege.org/l/21diffrexpoptsi) by Andrew Duffy offers not only a quick set up but also the ability to change the slit width instantly. Run the simulation and select “Single slit.” You can adjust the slit width and see the effect on the diffraction pattern on a screen and as a graph.


4.3 | Double-Slit Diffraction
Learning Objectives


By the end of this section, you will be able to:
• Describe the combined effect of interference and diffraction with two slits, each with finite width
• Determine the relative intensities of interference fringes within a diffraction pattern
• Identify missing orders, if any


When we studied interference in Young’s double-slit experiment, we ignored the diffraction effect in each slit. We assumed that the slits were so narrow that on the screen you saw only the interference of light from just two point sources. If the slit is smaller than the wavelength, then Figure 4.10(a) shows that there is just a spreading of light and no peaks or troughs on the screen. Therefore, it was reasonable to leave out the diffraction effect in that chapter. However, if you make the slit wider, Figure 4.10(b) and (c) show that you cannot ignore diffraction. In this section, we study the complications to the double-slit experiment that arise when you also need to take into account the diffraction effect of each slit.
To calculate the diffraction pattern for two (or any number of) slits, we need to generalize the method we just used for a single slit. That is, across each slit, we place a uniform distribution of point sources that radiate Huygens wavelets, and then we sum the wavelets from all the slits. This gives the intensity at any point on the screen. Although the details of that calculation can be complicated, the final result is quite simple:


Two-Slit Diffraction Pattern
The diffraction pattern of two slits of width D that are separated by a distance d is the interference pattern of two point sources separated by d multiplied by the diffraction pattern of a slit of width D.


In other words, the locations of the interference fringes are given by the equation d sin θ = mλ, the same as when we considered the slits to be point sources, but the intensities of the fringes are now reduced by diffraction effects, according to Equation 4.4. [Note that in the chapter on interference, we wrote d sin θ = mλ and used the integer m to refer to interference fringes. Equation 4.1 also uses m, but this time to refer to diffraction minima. If both equations are used simultaneously, it is good practice to use a different variable (such as n) for one of these integers in order to keep them distinct.]
Interference and diffraction effects operate simultaneously and generally produce minima at different angles. This gives rise to a complicated pattern on the screen, in which some of the maxima of interference from the two slits are missing if the maximum of the interference is in the same direction as the minimum of the diffraction. We refer to such a missing peak as a missing order. One example of a diffraction pattern on the screen is shown in Figure 4.11. The solid line with multiple peaks of various heights is the intensity observed on the screen. It is a product of the interference pattern of waves from separate slits and the diffraction of waves from within one slit.
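The condition for a missing order follows from equating the two angles: an interference maximum at sin θ = mλ/d coincides with a diffraction minimum at sin θ = nλ/D when m = n(d/D) for some integer n ≥ 1. The sketch below lists such orders; `missing_orders` is a hypothetical helper, and exact fractions are used so that ratios such as 5/2 are handled without rounding. It does not check whether a given order is actually observable, that is, whether mλ/d < 1.

```python
from fractions import Fraction

def missing_orders(d_over_D, max_order):
    """Interference orders m in 1..max_order that fall on a diffraction minimum.

    m * lambda / d = n * lambda / D  =>  m = n * (d / D), with n a positive integer.
    Pass d_over_D as an int or Fraction to keep the arithmetic exact.
    """
    ratio = Fraction(d_over_D)
    return [m for m in range(1, max_order + 1)
            if (Fraction(m) / ratio).denominator == 1]

print(missing_orders(3, 9))                 # d = 3D (for example D = 2 lambda, d = 6 lambda)
print(missing_orders(Fraction(5, 2), 10))   # d = 2.5 D
```

By symmetry the corresponding negative orders are missing as well. When d/D is not a ratio of small integers, missing orders become rare within the range of visible fringes.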


Figure 4.11 Diffraction from a double slit. The purple line with peaks of the same height is from the interference of the waves from two slits; the blue line with one big hump in the middle is the diffraction of waves from within one slit; and the thick red line is the product of the two, which is the pattern observed on the screen. The plot shows the expected result for a slit width D = 2λ and slit separation d = 6λ. The maximum of m = ±3 order for the interference is missing because the minimum of the diffraction occurs in the same direction.


Example 4.3
Intensity of the Fringes
Figure 4.11 shows that the intensity of the fringe for m = 3 is zero, but what about the other fringes? Calculate the intensity for the fringe at m = 1 relative to I0, the intensity of the central peak.
Strategy
Determine the angle for the double-slit interference fringe, using the equation from Interference, then determine the relative intensity in that direction due to diffraction by using Equation 4.4.
Solution
From the chapter on interference, we know that the bright interference fringes occur at d sin θ = mλ, or

$$\sin\theta = \frac{m\lambda}{d}.$$

From Equation 4.4,

$$I = I_0 \left(\frac{\sin\beta}{\beta}\right)^2, \quad \text{where } \beta = \frac{\phi}{2} = \frac{\pi D \sin\theta}{\lambda}.$$

Substituting from above,

$$\beta = \frac{\pi D \sin\theta}{\lambda} = \frac{\pi D}{\lambda} \cdot \frac{m\lambda}{d} = m\pi\frac{D}{d}.$$

For D = 2λ, d = 6λ, and m = 1,

$$\beta = \frac{(1)\pi(2\lambda)}{(6\lambda)} = \frac{\pi}{3}.$$

Then, the intensity is

$$I = I_0 \left(\frac{\sin\beta}{\beta}\right)^2 = I_0 \left(\frac{\sin(\pi/3)}{\pi/3}\right)^2 = 0.684 I_0.$$


Significance
Note that this approach is relatively straightforward and gives a result that is almost exactly the same as the more complicated analysis using phasors to work out the intensity values of the double-slit interference (thin line in Figure 4.11). The phasor approach accounts for the downward slope in the diffraction intensity (blue line) so that the peak near m = 1 occurs at a value of θ ever so slightly smaller than we have shown here.
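The same substitution, β = mπD/d, gives the diffraction envelope at every interference order, not just m = 1. The following sketch evaluates it for the geometry of Figure 4.11 (D = 2λ, d = 6λ, so D/d = 1/3); the function name is an illustrative assumption.

```python
import math

def envelope_at_order(m, D_over_d):
    """Diffraction factor (sin(beta)/beta)^2 at the m-th interference maximum,
    using beta = m * pi * D / d."""
    beta = m * math.pi * D_over_d
    if beta == 0:
        return 1.0   # central maximum, limit of sin(beta)/beta as beta -> 0
    return (math.sin(beta) / beta) ** 2

D_over_d = 2 / 6   # D = 2 lambda, d = 6 lambda
for m in range(0, 4):
    print(f"m = {m}: I/I0 = {envelope_at_order(m, D_over_d):.3f}")
```

The value for m = 3 comes out at essentially zero (limited only by floating-point rounding of sin π), which is the missing order of Figure 4.11.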


Example 4.4
Two-Slit Diffraction
Suppose that in Young’s experiment, slits of width 0.020 mm are separated by 0.20 mm. If the slits are illuminated by monochromatic light of wavelength 500 nm, how many bright fringes are observed in the central peak of the diffraction pattern?
Solution
From Equation 4.1, the angular position of the first diffraction minimum is

$$\theta \approx \sin\theta = \frac{\lambda}{D} = \frac{5.0 \times 10^{-7}\ \text{m}}{2.0 \times 10^{-5}\ \text{m}} = 2.5 \times 10^{-2}\ \text{rad}.$$

Using d sin θ = mλ for θ = 2.5 × 10⁻² rad, we find

$$m = \frac{d \sin\theta}{\lambda} = \frac{(0.20\ \text{mm})\left(2.5 \times 10^{-2}\ \text{rad}\right)}{5.0 \times 10^{-7}\ \text{m}} = 10,$$

which is the maximum interference order that fits inside the central peak. We note that m = ±10 are missing orders as θ matches exactly. Accordingly, we observe bright fringes for

m = −9, −8, −7, −6, −5, −4, −3, −2, −1, 0, +1, +2, +3, +4, +5, +6, +7, +8, and +9

for a total of 19 bright fringes.
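The count in Example 4.4 depends only on the ratio d/D: a bright fringe of order m lies strictly inside the central diffraction peak when |m|λ/d < λ/D, that is, when |m| < d/D. A minimal sketch of that count follows; the function name is hypothetical, and lengths are passed as integers (here in micrometers) so that the ratio is exact.

```python
from fractions import Fraction
import math

def fringes_in_central_peak(d, D):
    """Number of bright interference fringes with |m| < d / D.

    d is the slit separation and D the slit width, in the same units.
    Integer or Fraction inputs keep the ratio exact.
    """
    ratio = Fraction(d) / Fraction(D)
    m_max = math.ceil(ratio) - 1     # largest integer strictly below d / D
    return 2 * m_max + 1             # orders -m_max..+m_max, including m = 0

# Example 4.4: d = 0.20 mm = 200 um, D = 0.020 mm = 20 um
print(fringes_in_central_peak(200, 20))
```

Using `ceil(ratio) - 1` rather than `floor(ratio)` excludes the order that lands exactly on the diffraction minimum when d/D is an integer, which is the missing-order case discussed in the example.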
Check Your Understanding 4.3 For the experiment in Example 4.4, show that m = 20 is also a missing order.


Explore the effects of double-slit diffraction. In this simulation (https://openstaxcollege.org/l/21doubslitdiff) written by Fu-Kwun Hwang, select N = 2 using the slider and see what happens when you
control the slit width, slit separation and the wavelength. Can you make an order go “missing?”


4.4 | Diffraction Gratings
Learning Objectives


By the end of this section, you will be able to:
• Discuss the pattern obtained from diffraction gratings
• Explain diffraction grating effects






Analyzing the interference of light passing through two slits lays out the theoretical framework of interference and gives usa historical insight into Thomas Young’s experiments. However, most modern-day applications of slit interference use notjust two slits but many, approaching infinity for practical purposes. The key optical element is called a diffraction grating,an important tool in optical analysis.
Diffraction Gratings: An Infinite Number of Slits
The analysis of multi-slit interference in Interference allows us to consider what happens when the number of slits N approaches infinity. Recall that N – 2 secondary maxima appear between the principal maxima. We can see there will be an infinite number of secondary maxima that appear, and an infinite number of dark fringes between them. This makes the spacing between the fringes, and therefore the width of the maxima, infinitesimally small. Furthermore, because the intensity of the secondary maxima is proportional to 1/N², it approaches zero so that the secondary maxima are no longer seen. What remains are only the principal maxima, now very bright and very narrow (Figure 4.12).
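
The 1/N² behavior can be checked numerically with a small sketch. It assumes the standard N-slit interference factor [sin(Nφ/2) / (N sin(φ/2))]², normalized to the principal maximum, where φ = 2πd sin θ/λ is the phase difference between adjacent slits; the single-slit envelope is ignored, and the function name `relative_intensity` is hypothetical. For odd N, the point midway between two principal maxima (φ = π) is a secondary maximum, and its relative intensity comes out as 1/N².

```python
import math

def relative_intensity(n_slits, phase):
    """N-slit intensity divided by the principal-maximum intensity.

    phase is the phase difference between adjacent slits. Diffraction by the
    individual slits is ignored, so this is the interference factor only.
    """
    denom = n_slits * math.sin(phase / 2)
    if abs(denom) < 1e-12:      # principal maximum: the ratio tends to 1
        return 1.0
    return (math.sin(n_slits * phase / 2) / denom) ** 2

# Secondary maximum midway between principal maxima, for odd N
for n in (3, 5, 11, 101):
    midway = relative_intensity(n, math.pi)
    print(n, midway, 1 / n**2)
```

The two printed columns agree, and both shrink rapidly as N grows, which is why only the principal maxima remain visible for a grating.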


Figure 4.12 (a) Intensity of light transmitted through a large number of slits. When N approaches infinity, only the principal maxima remain as very bright and very narrow lines. (b) A laser beam passed through a diffraction grating. (credit b: modification of work by Sebastian Stapelberg)


In reality, the number of slits is not infinite, but it can be very large, large enough to produce the equivalent effect. A prime example is an optical element called a diffraction grating. A diffraction grating can be manufactured by carving glass with a sharp tool in a large number of precisely positioned parallel lines, with untouched regions acting like slits (Figure 4.13). This type of grating can be photographically mass produced rather cheaply. Because there can be over 1000 lines per millimeter across the grating, when a section as small as a few millimeters is illuminated by an incoming ray, the number of illuminated slits is effectively infinite, providing for very sharp principal maxima.


158 Chapter 4 | Diffraction


This OpenStax book is available for free at http://cnx.org/content/col12067/1.4




Figure 4.13 A diffraction grating can be manufactured by carving glass with a sharp tool in a large number of precisely positioned parallel lines.


Diffraction gratings work both for transmission of light, as in Figure 4.14, and for reflection of light, as on butterfly wings and the Australian opal in Figure 4.15. Natural diffraction gratings also occur in the feathers of certain birds such as the hummingbird. Tiny, finger-like structures in regular patterns act as reflection gratings, producing constructive interference that gives the feathers colors not solely due to their pigmentation. This is called iridescence.


Figure 4.14 (a) Light passing through a diffraction grating is diffracted in a pattern similar to a double slit, with bright regions at various angles. (b) The pattern obtained for white light incident on a grating. The central maximum is white, and the higher-order maxima disperse white light into a rainbow of colors.






Figure 4.15 (a) This Australian opal and (b) butterfly wings have rows of reflectors that act like reflection gratings, reflecting different colors at different angles. (credit b: modification of work by “whologwhy”/Flickr)
Applications of Diffraction Gratings
Where are diffraction gratings used in applications? Diffraction gratings are commonly used for spectroscopic dispersion and analysis of light. What makes them particularly useful is the fact that they form a sharper pattern than double slits do. That is, their bright fringes are narrower and brighter while their dark regions are darker. Diffraction gratings are key components of monochromators used, for example, in optical imaging of particular wavelengths from biological or medical samples. A diffraction grating can be chosen to specifically analyze a wavelength emitted by molecules in diseased cells in a biopsy sample or to help excite strategic molecules in the sample with a selected wavelength of light. Another vital use is in optical fiber technologies where fibers are designed to provide optimum performance at specific wavelengths. A range of diffraction gratings are available for selecting wavelengths for such use.
Example 4.5


Calculating Typical Diffraction Grating Effects
Diffraction gratings with 10,000 lines per centimeter are readily available. Suppose you have one, and you send a beam of white light through it to a screen 2.00 m away. (a) Find the angles for the first-order diffraction of the shortest and longest wavelengths of visible light (380 and 760 nm, respectively). (b) What is the distance between the ends of the rainbow of visible light produced on the screen for first-order interference? (See Figure 4.16.)






Figure 4.16 (a) The diffraction grating considered in this example produces a rainbow of colors on a screen a distance x = 2.00 m from the grating. The distances along the screen are measured perpendicular to the x-direction. In other words, the rainbow pattern extends out of the page. (b) In a bird’s-eye view, the rainbow pattern can be seen on a table where the equipment is placed.


Strategy
Once a value for the diffraction grating’s slit spacing d has been determined, the angles for the sharp lines can be found using the equation


d sin θ = mλ for m = 0, ±1, ±2, ... .


Since there are 10,000 lines per centimeter, each line is separated by 1/10,000 of a centimeter. Once we know the angles, we can find the distances along the screen by using simple trigonometry.
Solution


a. The distance between slits is d = (1 cm)/10,000 = 1.00 × 10⁻⁴ cm or 1.00 × 10⁻⁶ m. Let us call the two angles θV for violet (380 nm) and θR for red (760 nm). Solving the equation d sin θV = mλ for sin θV,
$$\sin\theta_V = \frac{m\lambda_V}{d},$$
where m = 1 for the first-order and λV = 380 nm = 3.80 × 10⁻⁷ m. Substituting these values gives
$$\sin\theta_V = \frac{3.80 \times 10^{-7}\ \text{m}}{1.00 \times 10^{-6}\ \text{m}} = 0.380.$$
Thus the angle θV is
$$\theta_V = \sin^{-1} 0.380 = 22.33^\circ.$$




Similarly,
$$\sin\theta_R = \frac{7.60 \times 10^{-7}\ \text{m}}{1.00 \times 10^{-6}\ \text{m}} = 0.760.$$
Thus the angle θR is
$$\theta_R = \sin^{-1} 0.760 = 49.46^\circ.$$


Notice that in both equations, we reported the results of these intermediate calculations to four significant figures to use with the calculation in part (b).
b. The distances on the screen are labeled yV and yR in Figure 4.16. Notice that tan θ = y/x. We can solve for yV and yR. That is,
yV = x tan θV = (2.00 m)(tan 22.33°) = 0.821 m


and
yR = x tan θR = (2.00 m)(tan 49.46°) = 2.338 m.


The distance between them is therefore
yR − yV = 1.517 m.


Significance
The large distance between the red and violet ends of the rainbow produced from the white light indicates the potential this diffraction grating has as a spectroscopic tool. The more it can spread out the wavelengths (greater dispersion), the more detail can be seen in a spectrum. This depends on the quality of the diffraction grating: it must be very precisely made in addition to having closely spaced lines.
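
The grating calculation in this example can be reproduced with a short script. It is a minimal sketch, not part of the original solution; the function name `grating_angle` and the variable names are assumptions. Because the script carries full precision instead of angles rounded to four figures, its last digits can differ somewhat from a hand calculation.

```python
import math

def grating_angle(wavelength, slit_spacing, order=1):
    """Return the diffraction angle in radians from d sin(theta) = m lambda."""
    s = order * wavelength / slit_spacing
    if abs(s) > 1:
        raise ValueError("this order does not exist for the given wavelength")
    return math.asin(s)

d = 1e-2 / 10_000      # 10,000 lines per centimeter -> slit spacing in meters
x = 2.00               # grating-to-screen distance in meters

theta_v = grating_angle(380e-9, d)   # violet end of the visible spectrum
theta_r = grating_angle(760e-9, d)   # red end of the visible spectrum
y_v = x * math.tan(theta_v)          # position on the screen, tan(theta) = y / x
y_r = x * math.tan(theta_r)

print(f"theta_V = {math.degrees(theta_v):.2f} deg, y_V = {y_v:.3f} m")
print(f"theta_R = {math.degrees(theta_r):.2f} deg, y_R = {y_r:.3f} m")
print(f"spread  = {y_r - y_v:.3f} m")
```

Calling `grating_angle(760e-9, d, order=2)` raises an error, since sin θ would exceed 1; with this grating the red end of the spectrum has no second-order maximum.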


4.4 Check Your Understanding If the line spacing of a diffraction grating d is not precisely known, we can use a light source with a well-determined wavelength to measure it. Suppose the first-order constructive fringe of the Hα emission line of hydrogen (λ = 656.3 nm) is measured at 11.36° using a spectrometer with a diffraction grating. What is the line spacing of this grating?


Take the same simulation (https://openstaxcollege.org/l/21doubslitdiff) we used for double-slit diffraction and try increasing the number of slits from N = 2 to N = 3, 4, 5... . The primary peaks become sharper, and the secondary peaks become less and less pronounced. By the time you reach the maximum number of N = 20, the system is behaving much like a diffraction grating.


4.5 | Circular Apertures and Resolution
Learning Objectives


By the end of this section, you will be able to:
• Describe the diffraction limit on resolution
• Describe the diffraction limit on beam propagation


Light diffracts as it moves through space, bending around obstacles, interfering constructively and destructively. This can be used as a spectroscopic tool (a diffraction grating disperses light according to wavelength, for example, and is used to produce spectra), but diffraction also limits the detail we can obtain in images.






Figure 4.17(a) shows the effect of passing light through a small circular aperture. Instead of a bright spot with sharp edges, we obtain a spot with a fuzzy edge surrounded by circles of light. This pattern is caused by diffraction, similar to that produced by a single slit. Light from different parts of the circular aperture interferes constructively and destructively. The effect is most noticeable when the aperture is small, but the effect is there for large apertures as well.


Figure 4.17 (a) Monochromatic light passed through a small circular aperture produces this diffraction pattern. (b) Two point-light sources that are close to one another produce overlapping images because of diffraction. (c) If the sources are closer together, they cannot be distinguished or resolved.


How does diffraction affect the detail that can be observed when light passes through an aperture? Figure 4.17(b) shows the diffraction pattern produced by two point-light sources that are close to one another. The pattern is similar to that for a single point source, and it is still possible to tell that there are two light sources rather than one. If they are closer together, as in Figure 4.17(c), we cannot distinguish them, thus limiting the detail or resolution we can obtain. This limit is an inescapable consequence of the wave nature of light.
Diffraction limits the resolution in many situations. The acuity of our vision is limited because light passes through the pupil, which is the circular aperture of the eye. Be aware that the diffraction-like spreading of light is due to the limited diameter of a light beam, not the interaction with an aperture. Thus, light passing through a lens with a diameter D shows this effect and spreads, blurring the image, just as light passing through an aperture of diameter D does. Thus, diffraction limits the resolution of any system having a lens or mirror. Telescopes are also limited by diffraction, because of the finite diameter D of the primary mirror.
Just what is the limit? To answer that question, consider the diffraction pattern for a circular aperture, which has a central maximum that is wider and brighter than the maxima surrounding it (similar to a slit) (Figure 4.18(a)). It can be shown that, for a circular aperture of diameter D, the first minimum in the diffraction pattern occurs at θ = 1.22λ/D (providing the aperture is large compared with the wavelength of light, which is the case for most optical instruments). The accepted criterion for determining the diffraction limit to resolution based on this angle is known as the Rayleigh criterion, which was developed by Lord Rayleigh in the nineteenth century.


Rayleigh Criterion
The diffraction limit to resolution states that two images are just resolvable when the center of the diffraction pattern of one is directly over the first minimum of the diffraction pattern of the other (Figure 4.18(b)).


The first minimum is at an angle of θ = 1.22λ/D, so that two point objects are just resolvable if they are separated by the angle
$$\theta = 1.22\,\frac{\lambda}{D} \tag{4.5}$$
where λ is the wavelength of light (or other electromagnetic radiation) and D is the diameter of the aperture, lens, mirror, etc., with which the two objects are observed. In this expression, θ has units of radians. This angle is also commonly known as the diffraction limit.






Figure 4.18 (a) Graph of intensity of the diffraction pattern for a circular aperture. Note that, similar to a single slit, the central maximum is wider and brighter than those to the sides. (b) Two point objects produce overlapping diffraction patterns. Shown here is the Rayleigh criterion for being just resolvable. The central maximum of one pattern lies on the first minimum of the other.


All attempts to observe the size and shape of objects are limited by the wavelength of the probe. Even the small wavelength of light prohibits exact precision. When extremely small wavelength probes are used, as with an electron microscope, the system is disturbed, still limiting our knowledge. Heisenberg’s uncertainty principle asserts that this limit is fundamental and inescapable, as we shall see in the chapter on quantum mechanics.
Example 4.6


Calculating Diffraction Limits of the Hubble Space Telescope
The primary mirror of the orbiting Hubble Space Telescope has a diameter of 2.40 m. Being in orbit, this telescope avoids the degrading effects of atmospheric distortion on its resolution. (a) What is the angle between two just-resolvable point light sources (perhaps two stars)? Assume an average light wavelength of 550 nm. (b) If these two stars are at a distance of 2 million light-years, which is the distance of the Andromeda Galaxy, how close together can they be and still be resolved? (A light-year, or ly, is the distance light travels in 1 year.)
Strategy
The Rayleigh criterion stated in Equation 4.5, θ = 1.22λ/D, gives the smallest possible angle θ between point sources, or the best obtainable resolution. Once this angle is known, we can calculate the distance between the stars, since we are given how far away they are.
Solution
a. The Rayleigh criterion for the minimum resolvable angle is
$$\theta = 1.22\,\frac{\lambda}{D}.$$
Entering known values gives
$$\theta = 1.22\,\frac{550 \times 10^{-9}\ \text{m}}{2.40\ \text{m}} = 2.80 \times 10^{-7}\ \text{rad}.$$


b. The distance s between two objects a distance r away and separated by an angle θ is s = rθ.






Substituting known values gives
$$s = (2.0 \times 10^{6}\ \text{ly})(2.80 \times 10^{-7}\ \text{rad}) = 0.56\ \text{ly}.$$
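
As a quick numerical check of both parts, the Rayleigh criterion and the arc-length relation s = rθ can be evaluated directly. This is a minimal sketch; the function name `rayleigh_angle` is an assumption made for illustration.

```python
def rayleigh_angle(wavelength, aperture_diameter):
    """Minimum resolvable angle in radians, theta = 1.22 lambda / D."""
    return 1.22 * wavelength / aperture_diameter

# Part (a): Hubble primary mirror (D = 2.40 m) with 550-nm light
theta = rayleigh_angle(550e-9, 2.40)

# Part (b): s = r * theta, with r = 2.0 million light-years, so s is in light-years
separation_ly = 2.0e6 * theta

print(f"theta = {theta:.2e} rad")
print(f"s = {separation_ly:.2f} ly")
```

Keeping r in light-years means no unit conversion is needed: the angle is dimensionless, so s comes out in the same unit as r.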


Significance
The angle found in part (a) is extraordinarily small (less than 1/50,000 of a degree), because the primary mirror is so large compared with the wavelength of light. As noted, diffraction effects are most noticeable when light interacts with objects having sizes on the order of the wavelength of light. However, the effect is still there, and there is a diffraction limit to what is observable. The actual resolution of the Hubble Telescope is not quite as good as that found here. As with all instruments, there are other effects, such as nonuniformities in mirrors or aberrations in lenses, that further limit resolution. However, Figure 4.19 gives an indication of the extent of the detail observable with the Hubble because of its size and quality, and especially because it is above Earth’s atmosphere.


Figure 4.19 These two photographs of the M82 Galaxy give an idea of the observable detail using (a) a ground-based telescope and (b) the Hubble Space Telescope. (credit a: modification of work by “Ricnun”/Wikimedia Commons)


The answer in part (b) indicates that two stars separated by about half a light-year can be resolved. The average distance between stars in a galaxy is on the order of five light-years in the outer parts and about one light-year near the galactic center. Therefore, the Hubble can resolve most of the individual stars in Andromeda Galaxy, even though it lies at such a huge distance that its light takes 2 million years to reach us. Figure 4.20 shows another mirror used to observe radio waves from outer space.


Figure 4.20 A 305-m-diameter paraboloid at Arecibo in Puerto Rico is lined with reflective material, making it into a radio telescope. It is the largest curved focusing dish in the world. Although D for Arecibo is much larger than for the Hubble Telescope, it detects radiation of a much longer wavelength and its diffraction limit is significantly poorer than Hubble’s. The Arecibo telescope is still very useful, because important information is carried by radio waves that is not carried by visible light. (credit: Jeff Hitchcock)






4.5 Check Your Understanding What is the angular resolution of the Arecibo telescope shown in Figure 4.20 when operated at 21-cm wavelength? How does it compare to the resolution of the Hubble Telescope?


Diffraction is not only a problem for optical instruments but also for the electromagnetic radiation itself. Any beam of light having a finite diameter D and a wavelength λ exhibits diffraction spreading. The beam spreads out with an angle θ given by Equation 4.5, θ = 1.22λ/D. Take, for example, a laser beam made of rays as parallel as possible (angles between rays as close to θ = 0° as possible). Such a beam nevertheless spreads out at an angle θ = 1.22λ/D, where D is the diameter of the beam and λ is its wavelength. This spreading is impossible to observe for a flashlight because its beam is not very parallel to start with. However, for long-distance transmission of laser beams or microwave signals, diffraction spreading can be significant (Figure 4.21). To reduce it, we can increase D. This is done for laser light sent to the moon to measure its distance from Earth. The laser beam is expanded through a telescope to make D much larger and θ smaller.
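
To get a feel for the numbers, the following sketch estimates how far a beam spreads sideways over the Earth-moon distance for two beam diameters. All input values are illustrative assumptions rather than data from the text (a 532-nm laser, a distance of about 3.84 × 10⁸ m, and beam diameters of 2 mm and 1 m), and the lateral spread is taken as the small-angle, order-of-magnitude estimate L·θ.

```python
def spread_angle(wavelength, beam_diameter):
    """Diffraction spreading angle in radians, theta = 1.22 lambda / D."""
    return 1.22 * wavelength / beam_diameter

def lateral_spread(wavelength, beam_diameter, distance):
    """Rough lateral spread after `distance`, using spread ~ distance * theta."""
    return distance * spread_angle(wavelength, beam_diameter)

WAVELENGTH = 532e-9        # assumed green laser, in meters
EARTH_MOON = 3.84e8        # approximate Earth-moon distance, in meters

narrow = lateral_spread(WAVELENGTH, 2e-3, EARTH_MOON)   # unexpanded 2-mm beam
wide = lateral_spread(WAVELENGTH, 1.0, EARTH_MOON)      # beam expanded to 1 m

print(f"2-mm beam: about {narrow / 1e3:.0f} km")
print(f"1-m beam:  about {wide:.0f} m")
```

Since θ is inversely proportional to D, expanding the beam by a factor of 500 reduces the spread by the same factor, which is the reason for sending the beam through a telescope first.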


Figure 4.21 The beam produced by this microwave transmission antenna spreads out at a minimum angle θ = 1.22λ/D due to diffraction. It is impossible to produce a near-parallel beam because the beam has a limited diameter.


In most biology laboratories, resolution is an issue when the use of the microscope is introduced. The smaller the distance x by which two objects can be separated and still be seen as distinct, the greater the resolution. The resolving power of a lens is defined as that distance x. An expression for resolving power is obtained from the Rayleigh criterion. Figure 4.22(a) shows two point objects separated by a distance x. According to the Rayleigh criterion, resolution is possible when the minimum angular separation is
$$\theta = 1.22\,\frac{\lambda}{D} = \frac{x}{d},$$


where d is the distance between the specimen and the objective lens, and we have used the small angle approximation (i.e., we have assumed that x is much smaller than d), so that tan θ ≈ sin θ ≈ θ. Therefore, the resolving power is
$$x = 1.22\,\frac{\lambda d}{D}.$$


Another way to look at this is by the concept of numerical aperture (NA), which is a measure of the maximum acceptance angle at which a lens will take light and still contain it within the lens. Figure 4.22(b) shows a lens and an object at point P. The NA here is a measure of the ability of the lens to gather light and resolve fine detail. The angle subtended by the lens at its focus is defined to be θ = 2α. From the figure and again using the small angle approximation, we can write
$$\sin\alpha = \frac{D/2}{d} = \frac{D}{2d}.$$


The NA for a lens is NA = n sin α, where n is the index of refraction of the medium between the objective lens and the object at point P. From this definition for NA, we can see that
$$x = 1.22\,\frac{\lambda d}{D} = 1.22\,\frac{\lambda}{2\sin\alpha} = 0.61\,\frac{\lambda n}{\text{NA}}.$$
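
The chain of equalities above can be verified numerically. The sketch below uses hypothetical values (550-nm light, a 4.0-mm lens aperture, a 20-mm object distance, and n = 1.0), and the function names `numerical_aperture` and `resolving_power` are assumptions made for illustration; it follows the expressions as written in the text, including the small-angle relation sin α = D/(2d).

```python
def numerical_aperture(n, lens_diameter, object_distance):
    """NA = n sin(alpha), with sin(alpha) taken as D / (2 d) as in the text."""
    return n * lens_diameter / (2 * object_distance)

def resolving_power(wavelength, n, na):
    """Smallest resolvable separation, x = 0.61 lambda n / NA."""
    return 0.61 * wavelength * n / na

# Hypothetical values chosen only for illustration
wavelength = 550e-9    # meters
n = 1.0                # air between the objective and the object
D = 4.0e-3             # lens diameter in meters
d = 20e-3              # specimen-to-lens distance in meters

na = numerical_aperture(n, D, d)
x_from_na = resolving_power(wavelength, n, na)
x_direct = 1.22 * wavelength * d / D       # x = 1.22 lambda d / D

print(f"NA = {na:.2f}")
print(f"x from NA     = {x_from_na:.3e} m")
print(f"x from d and D = {x_direct:.3e} m")
```

Both routes give the same separation, and doubling the NA in `resolving_power` halves x, which is the sense in which a larger NA resolves finer detail.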






In a microscope, NA is important because it relates to the resolving power of a lens. A lens with a large NA is able to resolve finer details. Lenses with larger NA are also able to collect more light and so give a brighter image. Another way to describe this situation is that the larger the NA, the larger the cone of light that can be brought into the lens, so more of the diffraction modes are collected. Thus the microscope has more information to form a clear image, and its resolving power is higher.


Figure 4.22 (a) Two points separated by a distance x and positioned a distance d away from the objective. (b) Terms and symbols used in discussion of resolving power for a lens and an object at point P (credit a: modification of work by “Infopro”/Wikimedia Commons).


One of the consequences of diffraction is that the focal point of a beam has a finite width and intensity distribution. Imagine focusing when only considering geometric optics, as in Figure 4.23(a). The focal point is regarded as an infinitely small point with a huge intensity and the capacity to incinerate most samples, irrespective of the NA of the objective lens, an unphysical oversimplification. For wave optics, due to diffraction, we take into account the phenomenon in which the focal point spreads to become a focal spot (Figure 4.23(b)) with the size of the spot decreasing with increasing NA. Consequently, the intensity in the focal spot increases with increasing NA. The higher the NA, the greater the chances of photodegrading the specimen. However, the spot never becomes a true point.


Figure 4.23 (a) In geometric optics, the focus is modelled as a point, but it is not physically possible to produce such a point because it implies infinite intensity. (b) In wave optics, the focus is an extended region.


In a different type of microscope, molecules within a specimen are made to emit light through a mechanism called fluorescence. By controlling the molecules emitting light, it has become possible to construct images with resolution much finer than the Rayleigh criterion, thus circumventing the diffraction limit. The development of super-resolved fluorescence microscopy led to the 2014 Nobel Prize in Chemistry.






In this Optical Resolution Model, two diffraction patterns for light through two circular apertures are shown side by side in this simulation (https://openstaxcollege.org/l/21optresmodsim) by Fu-Kwun Hwang. Watch the patterns merge as you decrease the aperture diameters.


4.6 | X-Ray Diffraction
Learning Objectives


By the end of this section, you will be able to:
• Describe interference and diffraction effects exhibited by X-rays in interaction with atomic-scale structures


Since X-ray photons are very energetic, they have relatively short wavelengths, on the order of 10⁻⁸ m to 10⁻¹² m. Thus, typical X-ray photons act like rays when they encounter macroscopic objects, like teeth, and produce sharp shadows. However, since atoms are on the order of 0.1 nm in size, X-rays can be used to detect the location, shape, and size of atoms and molecules. The process is called X-ray diffraction, and it involves the interference of X-rays to produce patterns that can be analyzed for information about the structures that scattered the X-rays.
Perhaps the most famous example of X-ray diffraction is the discovery of the double-helical structure of DNA in 1953 by an international team of scientists working at England’s Cavendish Laboratory: American James Watson, Englishman Francis Crick, and New Zealand-born Maurice Wilkins. Using X-ray diffraction data produced by Rosalind Franklin, they were the first to model the double-helix structure of DNA that is so crucial to life. For this work, Watson, Crick, and Wilkins were awarded the 1962 Nobel Prize in Physiology or Medicine. (There is some debate and controversy over the issue that Rosalind Franklin was not included in the prize, although she died in 1958, before the prize was awarded.)
Figure 4.24 shows a diffraction pattern produced by the scattering of X-rays from a crystal. This process is known as X-ray crystallography because of the information it can yield about crystal structure, and it was the type of data Rosalind Franklin supplied to Watson and Crick for DNA. Not only do X-rays confirm the size and shape of atoms, they give information about the atomic arrangements in materials. For example, more recent research in high-temperature superconductors involves complex materials whose lattice arrangements are crucial to obtaining a superconducting material. These can be studied using X-ray crystallography.


Figure 4.24 X-ray diffraction from the crystal of a protein (hen egg lysozyme) produced this interference pattern. Analysis of the pattern yields information about the structure of the protein. (credit: “Del45”/Wikimedia Commons)






Historically, the scattering of X-rays from crystals was used to prove that X-rays are energetic electromagnetic (EM) waves. This was suspected from the time of the discovery of X-rays in 1895, but it was not until 1912 that the German Max von Laue (1879–1960) convinced two of his colleagues to scatter X-rays from crystals. If a diffraction pattern is obtained, he reasoned, then the X-rays must be waves, and their wavelength could be determined. (The spacing of atoms in various crystals was reasonably well known at the time, based on good values for Avogadro’s number.) The experiments were convincing, and the 1914 Nobel Prize in Physics was given to von Laue for his suggestion leading to the proof that X-rays are EM waves. In 1915, the unique father-and-son team of Sir William Henry Bragg and his son Sir William Lawrence Bragg were awarded a joint Nobel Prize for inventing the X-ray spectrometer and the then-new science of X-ray analysis.
In ways reminiscent of thin-film interference, we consider two plane waves at X-ray wavelengths, each one reflecting off a different plane of atoms within a crystal’s lattice, as shown in Figure 4.25. From the geometry, the difference in path lengths is 2d sin θ. Constructive interference results when this distance is an integer multiple of the wavelength. This condition is captured by the Bragg equation,
$$m\lambda = 2d\sin\theta, \quad m = 1, 2, 3, \ldots \tag{4.6}$$


where m is a positive integer and d is the spacing between the planes. Following the Law of Reflection, both the incident and reflected waves are described by the same angle, θ, but unlike the general practice in geometric optics, θ is measured with respect to the surface itself, rather than the normal.


Figure 4.25 X-ray diffraction with a crystal. Two incident waves reflect off two planes of a crystal. The difference in path lengths is indicated by the dashed line.


Example 4.7
X-Ray Diffraction with Salt Crystals
Common table salt is composed mainly of NaCl crystals. In a NaCl crystal, there is a family of planes 0.252 nm apart. If the first-order maximum is observed at an incidence angle of 18.1°, what is the wavelength of the X-ray scattering from this crystal?
Strategy
Use the Bragg equation, Equation 4.6, mλ = 2d sin θ, to solve for λ.
Solution
For first-order, m = 1, and the plane spacing d is known. Solving the Bragg equation for wavelength yields


$$\lambda = \frac{2d\sin\theta}{m} = \frac{2(0.252 \times 10^{-9}\ \text{m})\sin(18.1^\circ)}{1} = 1.57 \times 10^{-10}\ \text{m, or } 0.157\ \text{nm}.$$


Significance
The determined wavelength fits within the X-ray region of the electromagnetic spectrum. Once again, the wave nature of light makes itself prominent when the wavelength (λ = 0.157 nm) is comparable to the size of the physical structures (d = 0.252 nm) it interacts with.
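
The Bragg calculation lends itself to a short numerical check. The sketch below is illustrative only; the function names `bragg_wavelength` and `bragg_angle` are assumptions. It solves Equation 4.6 for the wavelength and then inverts it to recover the first-order angle as a consistency test.

```python
import math

def bragg_wavelength(plane_spacing, angle_deg, order=1):
    """Solve m lambda = 2 d sin(theta) for lambda; theta is measured from the planes."""
    return 2 * plane_spacing * math.sin(math.radians(angle_deg)) / order

def bragg_angle(plane_spacing, wavelength, order=1):
    """Solve m lambda = 2 d sin(theta) for theta, returned in degrees."""
    s = order * wavelength / (2 * plane_spacing)
    if s > 1:
        raise ValueError("no maximum exists for this order")
    return math.degrees(math.asin(s))

# Example 4.7: d = 0.252 nm, first-order maximum at 18.1 degrees
lam = bragg_wavelength(0.252e-9, 18.1)
print(f"lambda = {lam:.3e} m")

# Round trip: the first-order angle computed from lambda
print(f"theta_1 = {bragg_angle(0.252e-9, lam):.1f} deg")
```

Because sin θ cannot exceed 1, `bragg_angle` raises an error for orders that are too high, which reflects the physical limit on the number of observable maxima.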


4.6 Check Your Understanding For the experiment described in Example 4.7, what are the two other angles where interference maxima may be observed? What limits the number of maxima?


Although Figure 4.25 depicts a crystal as a two-dimensional array of scattering centers for simplicity, real crystals are structures in three dimensions. Scattering can occur simultaneously from different families of planes at different orientations and spacing patterns known as Bragg planes, as shown in Figure 4.26. The resulting interference pattern can be quite complex.


Figure 4.26 Because of the regularity that makes a crystal structure, one crystal can have many families of planes within its geometry, each one giving rise to X-ray diffraction.


4.7 | Holography
Learning Objectives


By the end of this section, you will be able to:
• Describe how a three-dimensional image is recorded as a hologram
• Describe how a three-dimensional image is formed from a hologram


A hologram, such as the one in Figure 4.27, is a true three-dimensional image recorded on film by lasers. Holograms are used for amusement; decoration on novelty items and magazine covers; security on credit cards and driver’s licenses (a laser and other equipment are needed to reproduce them); and for serious three-dimensional information storage. You can see that a hologram is a true three-dimensional image because objects change relative position in the image when viewed from different angles.






Figure 4.27 Credit cards commonly have holograms for logos, making them difficult to reproduce. (credit: Dominic Alves)


The name hologram means “entire picture” (from the Greek holo, as in holistic) because the image is three-dimensional. Holography is the process of producing holograms and, although they are recorded on photographic film, the process is quite different from normal photography. Holography uses light interference or wave optics, whereas normal photography uses geometric optics. Figure 4.28 shows one method of producing a hologram. Coherent light from a laser is split by a mirror, with part of the light illuminating the object. The remainder, called the reference beam, shines directly on a piece of film. Light scattered from the object interferes with the reference beam, producing constructive and destructive interference. As a result, the exposed film looks foggy, but close examination reveals a complicated interference pattern stored on it. Where the interference was constructive, the film (a negative actually) is darkened. Holography is sometimes called lens-less photography, because it uses the wave characteristics of light, as contrasted to normal photography, which uses geometric optics and requires lenses.


Figure 4.28 Production of a hologram. Single-wavelength coherent light from a laser produces a well-defined interference pattern on a piece of film. The laser beam is split by a partially silvered mirror, with part of the light illuminating the object and the remainder shining directly on the film. (credit: modification of work by Mariana Ruiz Villarreal)


Light falling on a hologram can form a three-dimensional image of the original object. The process is complicated in detail, but the basics can be understood, as shown in Figure 4.29, in which a laser of the same type that exposed the film is now used to illuminate it. The myriad tiny exposed regions of the film are dark and block the light, whereas less exposed regions allow light to pass. The film thus acts much like a collection of diffraction gratings with various spacing patterns. Light passing through the hologram is diffracted in various directions, producing both real and virtual images of the object used to expose the film. The interference pattern is the same as that produced by the object. Moving your eye to various places in the interference pattern gives you different perspectives, just as looking directly at the object would. The image thus looks like the object and is three dimensional like the object.






Figure 4.29 A transmission hologram is one that produces real and virtual images when a laser of the same type as that which exposed the hologram is passed through it. Diffraction from various parts of the film produces the same interference pattern that was produced by the object that was used to expose it. (credit: modification of work by Mariana Ruiz Villarreal)


The hologram illustrated in Figure 4.29 is a transmission hologram. Holograms that are viewed with reflected light, such as the white light holograms on credit cards, are reflection holograms and are more common. White light holograms often appear a little blurry with rainbow edges, because the diffraction patterns of various colors of light are at slightly different locations due to their different wavelengths. Further uses of holography include all types of three-dimensional information storage, such as of statues in museums, engineering studies of structures, and images of human organs.
Invented in the late 1940s by Dennis Gabor (1900–1970), who won the 1971 Nobel Prize in Physics for his work, holography became far more practical with the development of the laser. Since lasers produce coherent single-wavelength light, their interference patterns are more pronounced. The precision is so great that it is even possible to record numerous holograms on a single piece of film by just changing the angle of the film for each successive image. This is how the holograms that move as you walk by them are produced, a kind of lens-less movie.
In a similar way, in the medical field, holograms have allowed complete three-dimensional holographic displays of objects from a stack of images. Storing these images for future use is relatively easy. With the use of an endoscope, high-resolution, three-dimensional holographic images of internal organs and tissues can be made.




This OpenStax book is available for free at http://cnx.org/content/col12067/1.4




CHAPTER 4 REVIEW
KEY TERMS
Bragg planes: families of planes within crystals that can give rise to X-ray diffraction
destructive interference for a single slit: occurs when the width of the slit is comparable to the wavelength of light illuminating it
diffraction: bending of a wave around the edges of an opening or an obstacle
diffraction grating: large number of evenly spaced parallel slits
diffraction limit: fundamental limit to resolution due to diffraction
hologram: three-dimensional image recorded on film by lasers; the word hologram means entire picture (from the Greek word holo, as in holistic)
holography: process of producing holograms with the use of lasers
missing order: interference maximum that is not seen because it coincides with a diffraction minimum
Rayleigh criterion: two images are just-resolvable when the center of the diffraction pattern of one is directly over the first minimum of the diffraction pattern of the other
resolution: ability, or limit thereof, to distinguish small details in images
two-slit diffraction pattern: diffraction pattern of two slits of width D that are separated by a distance d is the interference pattern of two point sources separated by d multiplied by the diffraction pattern of a slit of width D
width of the central peak: angle between the minimum for m = 1 and m = −1
X-ray diffraction: technique that provides the detailed information about crystallographic structure of natural and manufactured materials


KEY EQUATIONS
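Destructive interference for a single slit: $D \sin\theta = m\lambda$ for $m = \pm 1, \pm 2, \pm 3, \ldots$
Half phase angle: $\beta = \frac{\phi}{2} = \frac{\pi D \sin\theta}{\lambda}$
Field amplitude in the diffraction pattern: $E = N \Delta E_0 \frac{\sin\beta}{\beta}$
Intensity in the diffraction pattern: $I = I_0 \left( \frac{\sin\beta}{\beta} \right)^2$
Rayleigh criterion for circular apertures: $\theta = 1.22 \frac{\lambda}{D}$
Bragg equation: $m\lambda = 2d \sin\theta$, $m = 1, 2, 3, \ldots$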
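As a quick numerical illustration of the first key equation, the following Python sketch lists every single-slit minimum allowed by D sin θ = mλ. The function name `single_slit_minima` and the sample values (a 2.00-µm slit lit by 550-nm light) are assumptions chosen for illustration; they do not come from the text.

```python
import math

def single_slit_minima(slit_width, wavelength):
    """Return (m, angle in degrees) for each minimum of D sin(theta) = m * lambda.

    Both lengths must be in the same unit. Only positive orders are listed;
    the pattern is symmetric, so each angle also occurs at -theta.
    """
    minima = []
    m = 1
    # A minimum of order m exists only while sin(theta) = m * lambda / D <= 1.
    while m * wavelength / slit_width <= 1.0:
        theta = math.degrees(math.asin(m * wavelength / slit_width))
        minima.append((m, theta))
        m += 1
    return minima

# Illustrative values (assumed): 2.00-um slit, 550-nm light
for m, theta in single_slit_minima(2.00e-6, 550e-9):
    print(f"m = {m}: theta = {theta:.1f} deg")
```

The loop stops as soon as mλ/D exceeds 1, which is why a narrow slit produces only a few minima and a slit narrower than one wavelength produces none.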
SUMMARY
4.1 Single-Slit Diffraction


• Diffraction can send a wave around the edges of an opening or other obstacle.
• A single slit produces an interference pattern characterized by a broad central maximum with narrower and dimmer maxima to the sides.






4.2 Intensity in Single-Slit Diffraction
• The intensity pattern for diffraction due to a single slit can be calculated using phasors as $I = I_0 \left( \frac{\sin\beta}{\beta} \right)^2$, where $\beta = \frac{\phi}{2} = \frac{\pi D \sin\theta}{\lambda}$, D is the slit width, λ is the wavelength, and θ is the angle from the central peak.
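The single-slit intensity formula is easy to evaluate numerically. The sketch below is one possible implementation; the function name `relative_intensity` and the sample slit width and wavelength are assumptions made for illustration.

```python
import math

def relative_intensity(theta_deg, slit_width, wavelength):
    """Return I/I0 = (sin(beta)/beta)**2 with beta = pi * D * sin(theta) / lambda.

    slit_width and wavelength must be in the same unit; theta is in degrees.
    """
    beta = math.pi * slit_width * math.sin(math.radians(theta_deg)) / wavelength
    if beta == 0.0:
        return 1.0  # limit of sin(beta)/beta as beta -> 0 (center of the pattern)
    return (math.sin(beta) / beta) ** 2

# Illustrative values (assumed): 2.00-um slit, 500-nm light
for theta in (0.0, 5.0, 10.0, 20.0):
    ratio = relative_intensity(theta, 2.00e-6, 500e-9)
    print(f"theta = {theta:4.1f} deg: I/I0 = {ratio:.4f}")
```

The explicit check for β = 0 avoids a division by zero at the center of the pattern, where the ratio sin β/β approaches 1.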


4.3 Double-Slit Diffraction
• With real slits with finite widths, the effects of interference and diffraction operate simultaneously to form a complicated intensity pattern.
• Relative intensities of interference fringes within a diffraction pattern can be determined.
• Missing orders occur when an interference maximum and a diffraction minimum are located together.
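The points above can be turned into a short calculation. Interference maxima satisfy d sin θ = mλ and the first diffraction minimum satisfies D sin θ = λ, so the orders with |m| < d/D fall inside the central diffraction peak, and an order is missing whenever m equals an integer multiple of d/D. The Python sketch below assumes these two conditions; the function names, the small rounding tolerance, and the sample values are illustrative choices, not part of the text.

```python
import math

def fringes_in_central_peak(slit_separation, slit_width):
    """Count interference maxima strictly inside the central diffraction peak."""
    ratio = slit_separation / slit_width  # d / D
    # Orders with |m| < d/D lie inside the peak; the tolerance guards against
    # floating-point round-off when d/D is an exact integer.
    m_max = math.ceil(ratio - 1e-9) - 1
    return 2 * m_max + 1  # orders -m_max ... +m_max, including m = 0

def missing_orders(slit_separation, slit_width, m_limit):
    """List interference orders m <= m_limit that coincide with a diffraction minimum."""
    ratio = slit_separation / slit_width
    missing = []
    for m in range(1, m_limit + 1):
        n = m / ratio  # diffraction order that would sit at the same angle
        if round(n) >= 1 and abs(n - round(n)) < 1e-9:
            missing.append(m)
    return missing

# Illustrative values (assumed): d = 9.0 um, D = 3.0 um, so d/D = 3
print(fringes_in_central_peak(9.0, 3.0))   # 5 fringes: m = -2, ..., 2
print(missing_orders(9.0, 3.0, 10))        # orders 3, 6, and 9 are missing
```

Only the ratio d/D enters either result, so neither function needs the wavelength.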


4.4 Diffraction Gratings
• A diffraction grating consists of a large number of evenly spaced parallel slits that produce an interference pattern similar to but sharper than that of a double slit.
• Constructive interference occurs when d sin θ = mλ for m = 0, ± 1, ± 2, ..., where d is the distance between the slits, θ is the angle relative to the incident direction, and m is the order of the interference.
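To see how the grating condition limits the number of observable orders, the following Python sketch converts a line density into a slit spacing and lists every order for which d sin θ = mλ has a solution. The function name `grating_maxima` and the sample grating (3000 lines per centimeter, 600-nm light) are assumptions for illustration only.

```python
import math

def grating_maxima(lines_per_cm, wavelength_nm):
    """Return {order m: angle in degrees} for all m >= 0 with d sin(theta) = m * lambda."""
    d_nm = 1.0e7 / lines_per_cm  # slit spacing in nm (1 cm = 1e7 nm)
    angles = {}
    m = 0
    # An order exists only while sin(theta) = m * lambda / d does not exceed 1.
    while m * wavelength_nm / d_nm <= 1.0:
        angles[m] = math.degrees(math.asin(m * wavelength_nm / d_nm))
        m += 1
    return angles

# Illustrative values (assumed): 3000 lines/cm, 600-nm light
for m, theta in grating_maxima(3000, 600).items():
    print(f"m = {m}: theta = {theta:.2f} deg")
```

Negative orders appear at the mirror-image angles, so only m ≥ 0 is tabulated.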


4.5 Circular Apertures and Resolution
• Diffraction limits resolution.
• The Rayleigh criterion states that two images are just resolvable when the center of the diffraction pattern of one is directly over the first minimum of the diffraction pattern of the other.
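For a circular aperture, the Rayleigh criterion takes the form θ = 1.22 λ/D listed in the key equations. The sketch below evaluates that angle and, using the small-angle approximation, the smallest separation that can be resolved at a given distance. The function names and the sample aperture, wavelength, and distance are assumptions for illustration.

```python
def rayleigh_angle(wavelength, aperture_diameter):
    """Minimum resolvable angle in radians: theta = 1.22 * lambda / D."""
    return 1.22 * wavelength / aperture_diameter

def min_resolvable_separation(wavelength, aperture_diameter, distance):
    """Smallest resolvable separation at a given distance (small-angle approximation)."""
    return distance * rayleigh_angle(wavelength, aperture_diameter)

# Illustrative values (assumed): 500-nm light, 10-cm aperture, objects 1.0 km away
theta = rayleigh_angle(500e-9, 0.10)
separation = min_resolvable_separation(500e-9, 0.10, 1.0e3)
print(f"theta = {theta:.2e} rad, separation = {separation * 1e3:.1f} mm")
```

All lengths are in meters here; any consistent unit works for the angle, since only the ratio λ/D matters.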


4.6 X-Ray Diffraction
• X-rays are relatively short-wavelength EM radiation and can exhibit wave characteristics such as interference when interacting with correspondingly small objects.
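The Bragg equation from the key equations, mλ = 2d sin θ, determines the angles at which X-ray diffraction maxima appear. The Python sketch below lists those angles for a given plane spacing; the function name `bragg_angles` and the sample spacing and wavelength are assumed for illustration and do not describe a particular crystal.

```python
import math

def bragg_angles(plane_spacing, wavelength):
    """Return (m, angle in degrees) for each order allowed by m * lambda = 2 d sin(theta).

    The angle is measured from the reflecting planes; both lengths share one unit.
    """
    angles = []
    m = 1
    # An order exists only while sin(theta) = m * lambda / (2 d) does not exceed 1.
    while m * wavelength / (2.0 * plane_spacing) <= 1.0:
        theta = math.degrees(math.asin(m * wavelength / (2.0 * plane_spacing)))
        angles.append((m, theta))
        m += 1
    return angles

# Illustrative values (assumed): d = 0.25 nm, lambda = 0.15 nm
for m, theta in bragg_angles(0.25, 0.15):
    print(f"m = {m}: theta = {theta:.2f} deg")
```

No order exists once λ exceeds 2d, which is one way to see why wavelengths much longer than the plane spacing cannot probe a crystal lattice.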


4.7 Holography
• Holography is a technique based on wave interference to record and form three-dimensional images.
• Lasers offer a practical way to produce sharp holographic images because of their monochromatic and coherent light for pronounced interference patterns.


CONCEPTUAL QUESTIONS
4.1 Single-Slit Diffraction
1. As the width of the slit producing a single-slit diffraction pattern is reduced, how will the diffraction pattern produced change?
2. Compare interference and diffraction.
3. If you and a friend are on opposite sides of a hill, you can communicate with walkie-talkies but not with flashlights. Explain.
4. What happens to the diffraction pattern of a single slit when the entire optical apparatus is immersed in water?
5. In our study of diffraction by a single slit, we assume that the length of the slit is much larger than the width. What happens to the diffraction pattern if these two dimensions were comparable?
6. A rectangular slit is twice as wide as it is high. Is the central diffraction peak wider in the vertical direction or in the horizontal direction?






4.2 Intensity in Single-Slit Diffraction
7. In Equation 4.4, the parameter β looks like an angle but is not an angle that you can measure with a protractor in the physical world. Explain what β represents.


4.3 Double-Slit Diffraction
8. Shown below is the central part of the interference pattern for a pure wavelength of red light projected onto a double slit. The pattern is actually a combination of single- and double-slit interference. Note that the bright spots are evenly spaced. Is this a double- or single-slit characteristic? Note that some of the bright spots are dim on either side of the center. Is this a single- or double-slit characteristic? Which is smaller, the slit width or the separation between slits? Explain your responses.


4.5 Circular Apertures and Resolution
9. Is higher resolution obtained in a microscope with red or blue light? Explain your answer.
10. The resolving power of a refracting telescope increases with the size of its objective lens. What other advantage is gained with a larger lens?
11. The distance between atoms in a molecule is about 10⁻⁸ cm. Can visible light be used to “see” molecules?
12. A beam of light always spreads out. Why can a beam not be created with parallel rays to prevent spreading? Why can lenses, mirrors, or apertures not be used to correct the spreading?


4.6 X-Ray Diffraction
13. Crystal lattices can be examined with X-rays but not UV. Why?


4.7 Holography
14. How can you tell that a hologram is a true three-dimensional image and that those in three-dimensional movies are not?
15. If a hologram is recorded using monochromatic light at one wavelength but its image is viewed at another wavelength, say 10% shorter, what will you see? What if it is viewed using light of exactly half the original wavelength?
16. What image will one see if a hologram is recorded using monochromatic light but its image is viewed in white light? Explain.


PROBLEMS
4.1 Single-Slit Diffraction
17. (a) At what angle is the first minimum for 550-nm light falling on a single slit of width 1.00 µm? (b) Will there be a second minimum?
18. (a) Calculate the angle at which a 2.00-µm-wide slit produces its first minimum for 410-nm violet light. (b) Where is the first minimum for 700-nm red light?
19. (a) How wide is a single slit that produces its first minimum for 633-nm light at an angle of 28.0°? (b) At what angle will the second minimum be?
20. (a) What is the width of a single slit that produces its first minimum at 60.0° for 600-nm light? (b) Find the wavelength of light that has its first minimum at 62.0°.
21. Find the wavelength of light that has its third minimum at an angle of 48.6° when it falls on a single slit of width 3.00 µm.
22. (a) Sodium vapor light averaging 589 nm in wavelength falls on a single slit of width 7.50 µm. At what angle does it produce its second minimum? (b) What is the highest-order minimum produced?
23. Consider a single-slit diffraction pattern for λ = 589 nm, projected on a screen that is 1.00 m from a slit of width 0.25 mm. How far from the center of the pattern are the centers of the first and second dark fringes?
24. (a) Find the angle between the first minima for the two sodium vapor lines, which have wavelengths of 589.1 and 589.6 nm, when they fall upon a single slit of width 2.00 µm. (b) What is the distance between these minima if the diffraction pattern falls on a screen 1.00 m from the slit? (c) Discuss the ease or difficulty of measuring such a distance.
25. (a) What is the minimum width of a single slit (in multiples of λ) that will produce a first minimum for a wavelength λ? (b) What is its minimum width if it produces 50 minima? (c) 1000 minima?
26. (a) If a single slit produces a first minimum at 14.5°, at what angle is the second-order minimum? (b) What is the angle of the third-order minimum? (c) Is there a fourth-order minimum? (d) Use your answers to illustrate how the angular width of the central maximum is about twice the angular width of the next maximum (which is the angle between the first and second minima).
27. If the separation between the first and the second minima of a single-slit diffraction pattern is 6.0 mm, what is the distance between the screen and the slit? The light wavelength is 500 nm and the slit width is 0.16 mm.
28. A water break at the entrance to a harbor consists of a rock barrier with a 50.0-m-wide opening. Ocean waves of 20.0-m wavelength approach the opening straight on. At what angles to the incident direction are the boats inside the harbor most protected against wave action?
29. An aircraft maintenance technician walks past a tall hangar door that acts like a single slit for sound entering the hangar. Outside the door, on a line perpendicular to the opening in the door, a jet engine makes a 600-Hz sound. At what angle with the door will the technician observe the first minimum in sound intensity if the vertical opening is 0.800 m wide and the speed of sound is 340 m/s?


4.2 Intensity in Single-Slit Diffraction
30. A single slit of width 3.0 µm is illuminated by a sodium yellow light of wavelength 589 nm. Find the intensity at a 15° angle to the axis in terms of the intensity of the central maximum.
31. A single slit of width 0.1 mm is illuminated by a mercury light of wavelength 576 nm. Find the intensity at a 10° angle to the axis in terms of the intensity of the central maximum.
32. The width of the central peak in a single-slit diffraction pattern is 5.0 mm. The wavelength of the light is 600 nm, and the screen is 2.0 m from the slit. (a) What is the width of the slit? (b) Determine the ratio of the intensity at 4.5 mm from the center of the pattern to the intensity at the center.
33. Consider the single-slit diffraction pattern for λ = 600 nm, D = 0.025 mm, and x = 2.0 m. Find the intensity in terms of I0 at θ = 0.5°, 1.0°, 1.5°, 3.0°, and 10.0°.


4.3 Double-Slit Diffraction
34. Two slits of width 2 µm, each in an opaque material, are separated by a center-to-center distance of 6 µm. A monochromatic light of wavelength 450 nm is incident on the double-slit. One finds a combined interference and diffraction pattern on the screen.
(a) How many peaks of the interference will be observed in the central maximum of the diffraction pattern?
(b) How many peaks of the interference will be observed if the slit width is doubled while keeping the distance between the slits the same?
(c) How many peaks of interference will be observed if the slits are separated by twice the distance, that is, 12 µm, while keeping the widths of the slits the same?
(d) What will happen in (a) if instead of 450-nm light another light of wavelength 680 nm is used?
(e) What is the value of the ratio of the intensity of the central peak to the intensity of the next bright peak in (a)?
(f) Does this ratio depend on the wavelength of the light?
(g) Does this ratio depend on the width or separation of the slits?
35. A double slit produces a diffraction pattern that is a combination of single- and double-slit interference. Find the ratio of the width of the slits to the separation between them, if the first minimum of the single-slit pattern falls on the fifth maximum of the double-slit pattern. (This will greatly reduce the intensity of the fifth maximum.)
36. For a double-slit configuration where the slit separation is four times the slit width, how many interference fringes lie in the central peak of the diffraction pattern?
37. Light of wavelength 500 nm falls normally on 50 slits that are 2.5 × 10⁻³ mm wide and spaced 5.0 × 10⁻³ mm apart. How many interference fringes lie in the central peak of the diffraction pattern?
38. A monochromatic light of wavelength 589 nm incident on a double slit with slit width 2.5 µm and unknown separation results in a diffraction pattern containing nine interference peaks inside the central maximum. Find the separation of the slits.
39. When a monochromatic light of wavelength 430 nm is incident on a double slit of slit separation 5 µm, there are 11 interference fringes in its central maximum. How many interference fringes will be in the central maximum of a light of wavelength 632.8 nm for the same double slit?
40. Determine the intensities of two interference peaks other than the central peak in the central maximum of the diffraction, if possible, when a light of wavelength 628 nm is incident on a double slit of width 500 nm and separation 1500 nm. Use the intensity of the central spot to be 1 mW/cm².


4.4 Diffraction Gratings
41. A diffraction grating has 2000 lines per centimeter. At what angle will the first-order maximum be for 520-nm-wavelength green light?
42. Find the angle for the third-order maximum for 580-nm-wavelength yellow light falling on a diffraction grating having 1500 lines per centimeter.
43. How many lines per centimeter are there on a diffraction grating that gives a first-order maximum for 470-nm blue light at an angle of 25.0°?
44. What is the distance between lines on a diffraction grating that produces a second-order maximum for 760-nm red light at an angle of 60.0°?
45. Calculate the wavelength of light that has its second-order maximum at 45.0° when falling on a diffraction grating that has 5000 lines per centimeter.
46. An electric current through hydrogen gas produces several distinct wavelengths of visible light. What are the wavelengths of the hydrogen spectrum, if they form first-order maxima at angles 24.2°, 25.7°, 29.1°, and 41.0° when projected on a diffraction grating having 10,000 lines per centimeter?
47. (a) What do the four angles in the preceding problem become if a 5000-line per centimeter diffraction grating is used? (b) Using this grating, what would the angles be for the second-order maxima? (c) Discuss the relationship between integral reductions in lines per centimeter and the new angles of various order maxima.
48. What is the spacing between structures in a feather that acts as a reflection grating, given that they produce a first-order maximum for 525-nm light at a 30.0° angle?
49. An opal such as that shown in Figure 4.15 acts like a reflection grating with rows separated by about 8 µm. If the opal is illuminated normally, (a) at what angle will red light be seen and (b) at what angle will blue light be seen?
50. At what angle does a diffraction grating produce a second-order maximum for light having a first-order maximum at 20.0°?
51. (a) Find the maximum number of lines per centimeter a diffraction grating can have and produce a maximum for the smallest wavelength of visible light. (b) Would such a grating be useful for ultraviolet spectra? (c) For infrared spectra?
52. (a) Show that a 30,000 line per centimeter grating will not produce a maximum for visible light. (b) What is the longest wavelength for which it does produce a first-order maximum? (c) What is the greatest number of lines per centimeter a diffraction grating can have and produce a complete second-order spectrum for visible light?
53. The analysis shown below also applies to diffraction gratings with lines separated by a distance d. What is the distance between fringes produced by a diffraction grating having 125 lines per centimeter for 600-nm light, if the screen is 1.50 m away? (Hint: The distance between adjacent fringes is Δy = xλ/d, assuming the slit separation d is large compared with λ.)


4.5 Circular Apertures and Resolution
54. The 305-m-diameter Arecibo radio telescope pictured in Figure 4.20 detects radio waves with a 4.00-cm average wavelength. (a) What is the angle between two just-resolvable point sources for this telescope? (b) How close together could these point sources be at the 2 million light-year distance of the Andromeda Galaxy?
55. Assuming the angular resolution found for the Hubble Telescope in Example 4.6, what is the smallest detail that could be observed on the moon?
56. Diffraction spreading for a flashlight is insignificant compared with other limitations in its optics, such as spherical aberrations in its mirror. To show this, calculate the minimum angular spreading of a flashlight beam that is originally 5.00 cm in diameter with an average wavelength of 600 nm.
57. (a) What is the minimum angular spread of a 633-nm wavelength He-Ne laser beam that is originally 1.00 mm in diameter? (b) If this laser is aimed at a mountain cliff 15.0 km away, how big will the illuminated spot be? (c) How big a spot would be illuminated on the moon, neglecting atmospheric effects? (This might be done to hit a corner reflector to measure the round-trip time and, hence, distance.)
58. A telescope can be used to enlarge the diameter of a laser beam and limit diffraction spreading. The laser beam is sent through the telescope in the direction opposite to normal use and can then be projected onto a satellite or the moon. (a) If this is done with the Mount Wilson telescope, producing a 2.54-m-diameter beam of 633-nm light, what is the minimum angular spread of the beam? (b) Neglecting atmospheric effects, what is the size of the spot this beam would make on the moon, assuming a lunar distance of 3.84 × 10⁸ m?
59. The limit to the eye’s acuity is actually related to diffraction by the pupil. (a) What is the angle between two just-resolvable points of light for a 3.00-mm-diameter pupil, assuming an average wavelength of 550 nm? (b) Take your result to be the practical limit for the eye. What is the greatest possible distance a car can be from you if you can resolve its two headlights, given they are 1.30 m apart? (c) What is the distance between two just-resolvable points held at an arm’s length (0.800 m) from your eye? (d) How does your answer to (c) compare to details you normally observe in everyday circumstances?
60. What is the minimum diameter mirror on a telescope that would allow you to see details as small as 5.00 km on the moon some 384,000 km away? Assume an average wavelength of 550 nm for the light received.
61. Find the radius of a star’s image on the retina of an eye if its pupil is open to 0.65 cm and the distance from the pupil to the retina is 2.8 cm. Assume λ = 550 nm.
62. (a) The dwarf planet Pluto and its moon, Charon, are separated by 19,600 km. Neglecting atmospheric effects, should the 5.08-m-diameter Palomar Mountain telescope be able to resolve these bodies when they are 4.50 × 10⁹ km from Earth? Assume an average wavelength of 550 nm. (b) In actuality, it is just barely possible to discern that Pluto and Charon are separate bodies using a ground-based telescope. What are the reasons for this?


63. A spy satellite orbits Earth at a height of 180 km. What is the minimum diameter of the objective lens in a telescope that must be used to resolve columns of troops marching 2.0 m apart? Assume λ = 550 nm.
64. What is the minimum angular separation of two stars that are just-resolvable by the 8.1-m Gemini South telescope, if atmospheric effects do not limit resolution? Use 550 nm for the wavelength of the light from the stars.
65. The headlights of a car are 1.3 m apart. What is the maximum distance at which the eye can resolve these two headlights? Take the pupil diameter to be 0.40 cm.
66. When dots are placed on a page from a laser printer, they must be close enough so that you do not see the individual dots of ink. To do this, the separation of the dots must be less than Rayleigh’s criterion. Take the pupil of the eye to be 3.0 mm and the distance from the paper to the eye to be 35 cm; find the minimum separation of two dots such that they cannot be resolved. How many dots per inch (dpi) does this correspond to?
67. Suppose you are looking down at a highway from a jetliner flying at an altitude of 6.0 km. How far apart must two cars be if you are able to distinguish them? Assume that λ = 550 nm and that the diameter of your pupils is 4.0 mm.
68. Can an astronaut orbiting Earth in a satellite at a distance of 180 km from the surface distinguish two skyscrapers that are 20 m apart? Assume that the pupils of the astronaut’s eyes have a diameter of 5.0 mm and that most of the light is centered around 500 nm.
69. The characters of a stadium scoreboard are formed with closely spaced lightbulbs that radiate primarily yellow light. (Use λ = 600 nm.) How closely must the bulbs be spaced so that an observer 80 m away sees a display of continuous lines rather than the individual bulbs? Assume that the pupil of the observer’s eye has a diameter of 5.0 mm.
70. If a microscope can accept light from objects at angles as large as α = 70°, what is the smallest structure that can be resolved when illuminated with light of wavelength 500 nm and (a) the specimen is in air? (b) When the specimen is immersed in oil, with index of refraction of 1.52?
71. A camera uses a lens with aperture 2.0 cm. What is the angular resolution of a photograph taken at 700 nm wavelength? Can it resolve the millimeter markings of a ruler placed 35 m away?






4.6 X-Ray Diffraction
72. X-rays of wavelength 0.103 nm reflect off a crystal and a second-order maximum is recorded at a Bragg angle of 25.5°. What is the spacing between the scattering planes in this crystal?
73. A first-order Bragg reflection maximum is observed when a monochromatic X-ray falls on a crystal at a 32.3° angle to a reflecting plane. What is the wavelength of this X-ray?
74. An X-ray scattering experiment is performed on a crystal whose atoms form planes separated by 0.440 nm. Using an X-ray source of wavelength 0.548 nm, what is the angle (with respect to the planes in question) at which the experimenter needs to illuminate the crystal in order to observe a first-order maximum?
75. The structure of the NaCl crystal forms reflecting planes 0.541 nm apart. What is the smallest angle, measured from these planes, at which X-ray diffraction can be observed, if X-rays of wavelength 0.085 nm are used?
76. On a certain crystal, a first-order X-ray diffraction maximum is observed at an angle of 27.1° relative to its surface, using an X-ray source of unknown wavelength. Additionally, when illuminated with a different source, this time of known wavelength 0.137 nm, a second-order maximum is detected at 37.3°. Determine (a) the spacing between the reflecting planes, and (b) the unknown wavelength.
77. Calcite crystals contain scattering planes separated by 0.30 nm. What is the angular separation between first and second-order diffraction maxima when X-rays of 0.130 nm wavelength are used?
78. The first-order Bragg angle for a certain crystal is 12.1°. What is the second-order angle?


ADDITIONAL PROBLEMS
79. White light falls on two narrow slits separated by 0.40 mm. The interference pattern is observed on a screen 3.0 m away. (a) What is the separation between the first maxima for red light (λ = 700 nm) and violet light (λ = 400 nm)? (b) At what point nearest the central maximum will a maximum for yellow light (λ = 600 nm) coincide with a maximum for violet light? Identify the order for each maximum.
80. Microwaves of wavelength 10.0 mm fall normally on a metal plate that contains a slit 25 mm wide. (a) Where are the first minima of the diffraction pattern? (b) Would there be minima if the wavelength were 30.0 mm?
81. Quasars, or quasi-stellar radio sources, are astronomical objects discovered in 1960. They are distant but strong emitters of radio waves with angular size so small, they were originally unresolved, the same as stars. The quasar 3C405 is actually two discrete radio sources that subtend an angle of 82 arcsec. If this object is studied using radio emissions at a frequency of 410 MHz, what is the minimum diameter of a radio telescope that can resolve the two sources?
82. Two slits each of width 1800 nm and separated by the center-to-center distance of 1200 nm are illuminated by plane waves from a krypton ion laser emitting at wavelength 461.9 nm. Find the number of interference peaks in the central diffraction peak.
83. A microwave of an unknown wavelength is incident on a single slit of width 6 cm. The angular width of the central peak is found to be 25°. Find the wavelength.
84. Red light (wavelength 632.8 nm in air) from a Helium-Neon laser is incident on a single slit of width 0.05 mm. The entire apparatus is immersed in water of refractive index 1.333. Determine the angular width of the central peak.
85. A light ray of wavelength 461.9 nm emerges from a 2-mm circular aperture of a krypton ion laser. Due to diffraction, the beam expands as it moves out. How large is the central bright spot at (a) 1 m, (b) 1 km, (c) 1000 km, and (d) at the surface of the moon at a distance of 400,000 km from Earth?
86. How far apart must two objects be on the moon to be distinguishable by eye if only the diffraction effects of the eye’s pupil limit the resolution? Assume 550 nm for the wavelength of light, the pupil diameter 5.0 mm, and 400,000 km for the distance to the moon.
87. How far apart must two objects be on the moon to be resolvable by the 8.1-m-diameter Gemini North telescope at Mauna Kea, Hawaii, if only the diffraction effects of the telescope aperture limit the resolution? Assume 550 nm for the wavelength of light and 400,000 km for the distance to the moon.






88. A spy satellite is reputed to be able to resolve objects 10. cm apart while operating 197 km above the surface of Earth. What is the diameter of the aperture of the telescope if the resolution is only limited by the diffraction effects? Use 550 nm for light.
89. Monochromatic light of wavelength 530 nm passes through a horizontal single slit of width 1.5 µm in an opaque plate. A screen of dimensions 2.0 m × 2.0 m is 1.2 m away from the slit. (a) Which way is the diffraction pattern spread out on the screen? (b) What are the angles of the minima with respect to the center? (c) What are the angles of the maxima? (d) How wide is the central bright fringe on the screen? (e) How wide is the next bright fringe on the screen?
90. A monochromatic light of unknown wavelength is incident on a slit of width 20 µm. A diffraction pattern is seen at a screen 2.5 m away where the central maximum is spread over a distance of 10.0 cm. Find the wavelength.
91. A source of light having two wavelengths 550 nm and 600 nm of equal intensity is incident on a slit of width 1.8 µm. Find the separation of the m = 1 bright spots of the two wavelengths on a screen 30.0 cm away.
92. A single slit of width 2100 nm is illuminated normally by a wave of wavelength 632.8 nm. Find the phase difference between waves from the top and one third from the bottom of the slit to a point on a screen at a horizontal distance of 2.0 m and vertical distance of 10.0 cm from the center.
93. A single slit of width 3.0 µm is illuminated by a sodium yellow light of wavelength 589 nm. Find the intensity at a 15° angle to the axis in terms of the intensity of the central maximum.
94. A single slit of width 0.10 mm is illuminated by a mercury lamp of wavelength 576 nm. Find the intensity at a 10° angle to the axis in terms of the intensity of the central maximum.
95. A diffraction grating produces a second maximum that is 89.7 cm from the central maximum on a screen 2.0 m away. If the grating has 600 lines per centimeter, what is the wavelength of the light that produces the diffraction pattern?
96. A grating with 4000 lines per centimeter is used to diffract light that contains all wavelengths between 400 and 650 nm. How wide is the first-order spectrum on a screen 3.0 m from the grating?
97. A diffraction grating with 2000 lines per centimeter is used to measure the wavelengths emitted by a hydrogen gas discharge tube. (a) At what angles will you find the maxima of the two first-order blue lines of wavelengths 410 and 434 nm? (b) The maxima of two other first-order lines are found at θ1 = 0.097 rad and θ2 = 0.132 rad. What are the wavelengths of these lines?
98. For white light (400 nm < λ < 700 nm) falling normally on a diffraction grating, show that the second- and third-order spectra overlap no matter what the grating constant d is.
99. How many complete orders of the visible spectrum (400 nm < λ < 700 nm) can be produced with a diffraction grating that contains 5000 lines per centimeter?
100. Two lamps producing light of wavelength 589 nm are fixed 1.0 m apart on a wooden plank. What is the maximum distance an observer can be and still resolve the lamps as two separate sources of light, if the resolution is affected solely by the diffraction of light entering the eye? Assume light enters the eye through a pupil of diameter 4.5 mm.
101. On a bright clear day, you are at the top of a mountain and looking at a city 12 km away. There are two tall towers 20.0 m apart in the city. Can your eye resolve the two towers if the diameter of the pupil is 4.0 mm? If not, what should be the minimum magnification power of the telescope needed to resolve the two towers? In your calculations use 550 nm for the wavelength of the light.
102. Radio telescopes are telescopes used for the detection of radio emission from space. Because radio waves have much longer wavelengths than visible light, the diameter of a radio telescope must be very large to provide good resolution. For example, the radio telescope in Penticton, BC in Canada, has a diameter of 26 m and can be operated at frequencies as high as 6.6 GHz. (a) What is the wavelength corresponding to this frequency? (b) What is the angular separation of two radio sources that can be resolved by this telescope? (c) Compare the telescope’s resolution with the angular size of the moon.






Figure 4.30 (credit: Jason Nishiyama)
103. Calculate the wavelength of light that produces its first minimum at an angle of 36.9° when falling on a single slit of width 1.00 µm.
104. (a) Find the angle of the third diffraction minimum for 633-nm light falling on a slit of width 20.0 µm. (b) What slit width would place this minimum at 85.0°?
105. As an example of diffraction by apertures of everyday dimensions, consider a doorway of width 1.0 m. (a) What is the angular position of the first minimum in the diffraction pattern of 600-nm light? (b) Repeat this calculation for a musical note of frequency 440 Hz (A above middle C). Take the speed of sound to be 343 m/s.
106. What are the angular positions of the first and second minima in a diffraction pattern produced by a slit of width 0.20 mm that is illuminated by 400 nm light? What is the angular width of the central peak?
107. How far would you place a screen from the slit of the previous problem so that the second minimum is a distance of 2.5 mm from the center of the diffraction pattern?
108. How narrow is a slit that produces a diffraction pattern on a screen 1.8 m away whose central peak is 1.0 m wide? Assume λ = 589 nm.
109. Suppose that the central peak of a single-slit diffraction pattern is so wide that the first minima can be assumed to occur at angular positions of ±90°. For this case, what is the ratio of the slit width to the wavelength of the light?
110. The central diffraction peak of the double-slit interference pattern contains exactly nine fringes. What is the ratio of the slit separation to the slit width?
111. Determine the intensities of three interference peaks other than the central peak in the central maximum of the diffraction, if possible, when a light of wavelength 500 nm is incident normally on a double slit of width 1000 nm and separation 1500 nm. Use the intensity of the central spot to be $1\ \mathrm{mW/cm^2}$.
112. The yellow light from a sodium vapor lamp seems to be of pure wavelength, but it produces two first-order maxima at 36.093° and 36.129° when projected on a 10,000 line per centimeter diffraction grating. What are the two wavelengths to an accuracy of 0.1 nm?
113. Structures on a bird feather act like a reflection grating having 8000 lines per centimeter. What is the angle of the first-order maximum for 600-nm light?
114. If a diffraction grating produces a first-order maximum for the shortest wavelength of visible light at 30.0°, at what angle will the first-order maximum be for the largest wavelength of visible light?
115. (a) What visible wavelength has its fourth-order maximum at an angle of 25.0° when projected on a 25,000-line per centimeter diffraction grating? (b) What is unreasonable about this result? (c) Which assumptions are unreasonable or inconsistent?
116. Consider a spectrometer based on a diffraction grating. Construct a problem in which you calculate the distance between two wavelengths of electromagnetic radiation in your spectrometer. Among the things to be considered are the wavelengths you wish to be able to distinguish, the number of lines per meter on the diffraction grating, and the distance from the grating to the screen or detector. Discuss the practicality of the device in terms of being able to discern between wavelengths of interest.
117. An amateur astronomer wants to build a telescope with a diffraction limit that will allow him to see if there are people on the moons of Jupiter. (a) What diameter mirror is needed to be able to see 1.00-m detail on a Jovian moon at a distance of $7.50 \times 10^{8}$ km from Earth? The wavelength of light averages 600 nm. (b) What is unreasonable about this result? (c) Which assumptions are unreasonable or inconsistent?






CHALLENGE PROBLEMS
118. Blue light of wavelength 450 nm falls on a slit of width 0.25 mm. A converging lens of focal length 20 cm is placed behind the slit and focuses the diffraction pattern on a screen. (a) How far is the screen from the lens? (b) What is the distance between the first and the third minima of the diffraction pattern?
119. (a) Assume that the maxima are halfway between the minima of a single-slit diffraction pattern. Then use the diameter and circumference of the phasor diagram, as described in Intensity in Single-Slit Diffraction, to determine the intensities of the third and fourth maxima in terms of the intensity of the central maximum. (b) Do the same calculation, using Equation 4.4.
120. (a) By differentiating Equation 4.4, show that the higher-order maxima of the single-slit diffraction pattern occur at values of β that satisfy $\tan\beta = \beta$. (b) Plot $y = \tan\beta$ and $y = \beta$ versus β and find the intersections of these two curves. What information do they give you about the locations of the maxima? (c) Convince yourself that these points do not appear exactly at $\beta = \left(n + \tfrac{1}{2}\right)\pi$, where n = 0, 1, 2, …, but are quite close to these values.
121. What is the maximum number of lines per centimeter a diffraction grating can have and produce a complete first-order spectrum for visible light?
122. Show that a diffraction grating cannot produce a second-order maximum for a given wavelength of light unless the first-order maximum is at an angle less than 30.0°.
123. A He-Ne laser beam is reflected from the surface of a CD onto a wall. The brightest spot is the reflected beam at an angle equal to the angle of incidence. However, fringes are also observed. If the wall is 1.50 m from the CD, and the first fringe is 0.600 m from the central maximum, what is the spacing of grooves on the CD?
124. Objects viewed through a microscope are placed very close to the focal point of the objective lens. Show that the minimum separation x of two objects resolvable through the microscope is given by $x = 1.22\,\dfrac{\lambda f_0}{D}$, where $f_0$ is the focal length and D is the diameter of the objective lens as shown below.






5 | RELATIVITY


Figure 5.1 Special relativity explains how time passes slightly differently on Earth and within the rapidly moving global positioning satellite (GPS). GPS units in vehicles could not find their correct location on Earth without taking this correction into account. (credit: USAF)


Chapter Outline
5.1 Invariance of Physical Laws
5.2 Relativity of Simultaneity
5.3 Time Dilation
5.4 Length Contraction
5.5 The Lorentz Transformation
5.6 Relativistic Velocity Transformation
5.7 Doppler Effect for Light
5.8 Relativistic Momentum
5.9 Relativistic Energy


Introduction
The special theory of relativity was proposed in 1905 by Albert Einstein (1879–1955). It describes how time, space, and physical phenomena appear in different frames of reference that are moving at constant velocity with respect to each other. This differs from Einstein’s later work on general relativity, which deals with any frame of reference, including accelerated frames.
The theory of relativity led to a profound change in the way we perceive space and time. The “common sense” rules that we use to relate space and time measurements in the Newtonian worldview differ seriously from the correct rules at speeds near the speed of light. For example, the special theory of relativity tells us that measurements of length and time intervals are not the same in reference frames moving relative to one another. A particle might be observed to have a lifetime of $1.0 \times 10^{-8}$ s in one reference frame, but a lifetime of $2.0 \times 10^{-8}$ s in another; and an object might be measured to be 2.0 m long in one frame and 3.0 m long in another frame. These effects are usually significant only at speeds comparable to the speed of light, but even at the much lower speeds of the global positioning satellite, which requires extremely accurate time measurements to function, the different lengths of the same distance in different frames of reference are significant enough that they need to be taken into account.
Unlike Newtonian mechanics, which describes the motion of particles, or Maxwell's equations, which specify how the electromagnetic field behaves, special relativity is not restricted to a particular type of phenomenon. Instead, its rules on space and time affect all fundamental physical theories.
The modifications of Newtonian mechanics in special relativity do not invalidate classical Newtonian mechanics or require its replacement. Instead, the equations of relativistic mechanics differ meaningfully from those of classical Newtonian mechanics only for objects moving at relativistic speeds (i.e., speeds less than, but comparable to, the speed of light). In the macroscopic world that you encounter in your daily life, the relativistic equations reduce to classical equations, and the predictions of classical Newtonian mechanics agree closely enough with experimental results to disregard relativistic corrections.
5.1 | Invariance of Physical Laws


Learning Objectives
By the end of this section, you will be able to:
• Describe the theoretical and experimental issues that Einstein’s theory of special relativity addressed.
• State the two postulates of the special theory of relativity.


Suppose you calculate the hypotenuse of a right triangle given the base angles and adjacent sides. Whether you calculate the hypotenuse from one of the sides and the cosine of the base angle, or from the Pythagorean theorem, the results should agree. Predictions based on different principles of physics must also agree, whether we consider them principles of mechanics or principles of electromagnetism.
Albert Einstein pondered a disagreement between predictions based on electromagnetism and on assumptions made in classical mechanics. Specifically, suppose an observer measures the velocity of a light pulse in the observer’s own rest frame; that is, in the frame of reference in which the observer is at rest. According to the assumptions long considered obvious in classical mechanics, if an observer measures a velocity $\vec{v}$ in one frame of reference, and that frame of reference is moving with velocity $\vec{u}$ past a second reference frame, an observer in the second frame measures the original velocity as $\vec{v}\,' = \vec{v} + \vec{u}$. This sum of velocities is often referred to as Galilean relativity. If this principle is correct, the pulse of light that the observer measures as traveling with speed c travels at speed c + u measured in the frame of the second observer. If we reasonably assume that the laws of electrodynamics are the same in both frames of reference, then the predicted speed of light (in vacuum) in both frames should be $c = 1/\sqrt{\varepsilon_0 \mu_0}$. Each observer should measure the same speed of the light pulse with respect to that observer’s own rest frame. To reconcile difficulties of this kind, Einstein constructed his special theory of relativity, which introduced radical new ideas about time and space that have since been confirmed experimentally.
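The conflict described above can be made concrete with a short numerical sketch. It evaluates $c = 1/\sqrt{\varepsilon_0 \mu_0}$ from tabulated constants and compares it with the speed that Galilean addition would predict in a second frame. The numerical values of the constants and the frame speed u (roughly Earth's orbital speed) are assumptions chosen for illustration; they do not come from the text.

```python
import math

# Assumed constants: the vacuum permittivity is a tabulated (CODATA-style) value,
# and mu_0 is taken as the classical 4*pi*1e-7 N/A^2.
EPSILON_0 = 8.8541878128e-12  # F/m
MU_0 = 4 * math.pi * 1e-7     # N/A^2

def speed_of_light_from_constants(eps0=EPSILON_0, mu0=MU_0):
    """Return c = 1/sqrt(eps0 * mu0) in m/s, as predicted by electromagnetism."""
    return 1.0 / math.sqrt(eps0 * mu0)

def galilean_speed(c, u):
    """Speed of a light pulse that Galilean addition predicts in a frame moving at u."""
    return c + u

c = speed_of_light_from_constants()
u = 3.0e4  # m/s, illustrative frame speed (an assumption)
print(f"c from electromagnetism:       {c:.6e} m/s")
print(f"Galilean prediction, c + u:    {galilean_speed(c, u):.6e} m/s")
```

The two printed numbers differ, yet electromagnetism predicts the first value in every inertial frame. That mismatch is the difficulty Einstein set out to resolve.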
Inertial Frames
All velocities are measured relative to some frame of reference. For example, a car’s motion is measured relative to its starting position on the road it travels on; a projectile’s motion is measured relative to the surface from which it is launched; and a planet’s orbital motion is measured relative to the star it orbits. The frames of reference in which mechanics takes the simplest form are those that are not accelerating. Newton’s first law, the law of inertia, holds exactly in such a frame.


Inertial Reference Frame
An inertial frame of reference is a reference frame in which a body at rest remains at rest and a body in motion moves at a constant speed in a straight line unless acted upon by an outside force.


For example, to a passenger inside a plane flying at constant speed and constant altitude, physics seems to work exactly the same as when the passenger is standing on the surface of Earth. When the plane is taking off, however, matters are somewhat more complicated. In this case, the passenger at rest inside the plane concludes that a net force F on an object is not equal to the product of mass and acceleration, ma. Instead, F is equal to ma plus a fictitious force. This situation is not as simple as in an inertial frame. The term “special” in “special relativity” refers to dealing only with inertial frames of reference. Einstein’s later theory of general relativity deals with all kinds of reference frames, including accelerating, and therefore non-inertial, reference frames.
Einstein’s First Postulate
Not only are the principles of classical mechanics simplest in inertial frames, but they are the same in all inertial frames. Einstein based the first postulate of his theory on the idea that this is true for all the laws of physics, not merely those in mechanics.


First Postulate of Special Relativity
The laws of physics are the same in all inertial frames of reference.


This postulate denies the existence of a special or preferred inertial frame. The laws of nature do not give us a way to endow any one inertial frame with special properties. For example, we cannot identify any inertial frame as being in a state of “absolute rest.” We can only determine the relative motion of one frame with respect to another.
There is, however, more to this postulate than meets the eye. The laws of physics include only those that satisfy this postulate. We will see that the definitions of energy and momentum must be altered to fit this postulate. Another outcome of this postulate is the famous equation $E = mc^2$, which relates energy to mass.
Einstein’s Second Postulate
The second postulate upon which Einstein based his theory of special relativity deals with the speed of light. Late in the nineteenth century, the major tenets of classical physics were well established. Two of the most important were the laws of electromagnetism and Newton’s laws. Investigations such as Young’s double-slit experiment in the early 1800s had convincingly demonstrated that light is a wave. Maxwell’s equations of electromagnetism implied that electromagnetic waves travel at $c = 3.00 \times 10^{8}$ m/s in a vacuum, but they do not specify the frame of reference in which light has this speed. Many types of waves were known, and all travelled in some medium. Scientists therefore assumed that some medium carried the light, even in a vacuum, and that light travels at a speed c relative to that medium (often called “the aether”).
Starting in the mid-1880s, the American physicist A.A. Michelson, later aided by E.W. Morley, made a series of direct measurements of the speed of light. They intended to deduce from their data the speed v at which Earth was moving through the mysterious medium for light waves. The speed of light measured on Earth should have been c + v when Earth’s motion was opposite to the medium’s flow past Earth at speed v, and c − v when Earth was moving in the same direction as the medium. The results of their measurements were startling.


Michelson-Morley Experiment
The Michelson-Morley experiment demonstrated that the speed of light in a vacuum is independent of the motion of Earth about the Sun.


The eventual conclusion derived from this result is that light, unlike mechanical waves such as sound, does not need a medium to carry it. Furthermore, the Michelson-Morley results implied that the speed of light c is independent of the motion of the source relative to the observer. That is, everyone observes light to move at speed c regardless of how they move relative to the light source or to one another. For several years, many scientists tried unsuccessfully to explain these results within the framework of Newton’s laws.
In addition, there was a contradiction between the principles of electromagnetism and the assumption made in Newton’s laws about relative velocity. Classically, the velocity of an object in one frame of reference and the velocity of that object in a second frame of reference relative to the first should combine like simple vectors to give the velocity seen in the second frame. If that were correct, then two observers moving at different speeds would see light traveling at different speeds. Imagine what a light wave would look like to a person traveling along with it (in vacuum) at a speed c. If such a motion were possible, then the wave would be stationary relative to the observer. It would have electric and magnetic fields whose strengths varied with position but were constant in time. This is not allowed by Maxwell’s equations. So either Maxwell’s equations are different in different inertial frames, or an object with mass cannot travel at speed c. Einstein concluded that the latter is true: An object with mass cannot travel at speed c. Maxwell’s equations are correct, but Newton’s addition of velocities is not correct for light.


Not until 1905, when Einstein published his first paper on special relativity, was the currently accepted conclusion reached. Based mostly on his analysis that the laws of electricity and magnetism would not allow another speed for light, and only slightly aware of the Michelson-Morley experiment, Einstein detailed his second postulate of special relativity.


Second Postulate of Special Relativity
Light travels in a vacuum with the same speed c in any direction in all inertial frames.


In other words, light has the same definite speed c for any observer, regardless of the relative motion of the source. This deceptively simple and counterintuitive postulate, along with the first postulate, leaves all else open for change. Among the changes are the loss of agreement on the time between events, the variation of distance with speed, and the realization that matter and energy can be converted into one another. We describe these concepts in the following sections.
5.1 Check Your Understanding Explain how special relativity differs from general relativity.


5.2 | Relativity of Simultaneity
Learning Objectives


By the end of this section, you will be able to:
• Show from Einstein's postulates that two events measured as simultaneous in one inertial frame are not necessarily simultaneous in all inertial frames.
• Describe how simultaneity is a relative concept for observers in different inertial frames in relative motion.


Do time intervals depend on who observes them? Intuitively, it seems that the time for a process, such as the elapsed time for a foot race (Figure 5.2), should be the same for all observers. In everyday experiences, disagreements over elapsed time have to do with the accuracy of measuring time. No one would be likely to argue that the actual time interval was different for the moving runner and for the stationary clock displayed. Carefully considering just how time is measured, however, shows that elapsed time does depend on the relative motion of an observer with respect to the process being measured.


Figure 5.2 Elapsed time for a foot race is the same for all observers, but at relativistic speeds, elapsed time depends on the motion of the observer relative to the location where the process being timed occurs. (credit: "Jason Edward Scott Bain"/Flickr)


Consider how we measure elapsed time. If we use a stopwatch, for example, how do we know when to start and stop the watch? One method is to use the arrival of light from the event. For example, if you’re in a moving car and observe the light arriving from a traffic signal change from green to red, you know it’s time to step on the brake pedal. The timing is more accurate if some sort of electronic detection is used, avoiding human reaction times and other complications.
Now suppose two observers use this method to measure the time interval between two flashes of light from flash lamps that are a distance apart (Figure 5.3). An observer A is seated midway on a rail car with two flash lamps at opposite sides equidistant from her. A pulse of light is emitted from each flash lamp and moves toward observer A, shown in frame (a) of the figure. The rail car is moving rapidly in the direction indicated by the velocity vector in the diagram. An observer B standing on the platform is facing the rail car as it passes and observes both flashes of light reach him simultaneously, as shown in frame (c). He measures the distances from where he saw the pulses originate, finds them equal, and concludes that the pulses were emitted simultaneously.
However, because of Observer A’s motion, the pulse from the right of the railcar, from the direction the car is moving, reaches her before the pulse from the left, as shown in frame (b). She also measures the distances from within her frame of reference, finds them equal, and concludes that the pulses were not emitted simultaneously.
The two observers reach conflicting conclusions about whether the two events at well-separated locations were simultaneous. Both frames of reference are valid, and both conclusions are valid. Whether two events at separate locations are simultaneous depends on the motion of the observer relative to the locations of the events.
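The rail-car argument can be checked with a small calculation carried out entirely in observer B's (platform) frame. The sketch below assumes both lamps flash at t = 0 in B's frame, each a distance `half_length` from A at that instant, and that A moves toward the right-hand lamp at speed v. The function name and the numbers used are hypothetical, chosen only for illustration.

```python
C = 299_792_458.0  # speed of light in m/s

def arrival_times_at_A(half_length, v, c=C):
    """Times (in B's frame) at which the right and left pulses reach observer A.

    Both lamps flash at t = 0 in B's frame. A moves toward the point where the
    right-hand flash occurred, so the gap to that pulse closes at rate c + v,
    while the left-hand pulse gains on her only at rate c - v.
    """
    if not 0 <= v < c:
        raise ValueError("v must satisfy 0 <= v < c")
    t_right = half_length / (c + v)
    t_left = half_length / (c - v)
    return t_right, t_left

# Illustrative values (assumptions): 10 m from A to each lamp, car at half the speed of light.
t_right, t_left = arrival_times_at_A(half_length=10.0, v=0.5 * C)
print(f"right pulse reaches A after {t_right:.3e} s")
print(f"left pulse reaches A after  {t_left:.3e} s")
```

For any v greater than zero the right-hand pulse arrives first. Because A finds the lamps equidistant from her and measures the same speed c for both pulses, she concludes that the right-hand lamp flashed first.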






Figure 5.3 (a) Two pulses of light are emitted simultaneously relative to observer B. (c) The pulses reach observer B’s position simultaneously. (b) Because of A’s motion, she sees the pulse from the right first and concludes the bulbs did not flash simultaneously. Both conclusions are correct.


Here, the relative velocity between observers affects whether two events a distance apart are observed to be simultaneous. Simultaneity is not absolute. We might have guessed (incorrectly) that if light is emitted simultaneously, then two observers halfway between the sources would see the flashes simultaneously. But careful analysis shows this cannot be the case if the speed of light is the same in all inertial frames.
This type of thought experiment (in German, “Gedankenexperiment”) shows that seemingly obvious conclusions must be changed to agree with the postulates of relativity. The validity of thought experiments can only be determined by actual observation, and careful experiments have repeatedly confirmed Einstein’s theory of relativity.






5.3 | Time Dilation
Learning Objectives


By the end of this section, you will be able to:
• Explain how time intervals can be measured differently in different reference frames.
• Describe how to distinguish a proper time interval from a dilated time interval.
• Describe the significance of the muon experiment.
• Explain why the twin paradox is not a contradiction.
• Calculate time dilation given the speed of an object in a given frame.


The analysis of simultaneity shows that Einstein’s postulates imply an important effect: Time intervals have different values when measured in different inertial frames. Suppose, for example, an astronaut measures the time it takes for a pulse of light to travel a distance perpendicular to the direction of his ship’s motion (relative to an earthbound observer), bounce off a mirror, and return (Figure 5.4). How does the elapsed time that the astronaut measures in the spacecraft compare with the elapsed time that an earthbound observer measures by observing what is happening in the spacecraft?
Examining this question leads to a profound result. The elapsed time for a process depends on which observer is measuring it. In this case, the time measured by the astronaut (within the spaceship where the astronaut is at rest) is smaller than the time measured by the earthbound observer (to whom the astronaut is moving). The time elapsed for the same process is different for the observers, because the distance the light pulse travels in the astronaut’s frame is smaller than in the earthbound frame, as seen in Figure 5.4. Light travels at the same speed in each frame, so it takes more time to travel the greater distance in the earthbound frame.






Figure 5.4 (a) An astronaut measures the time Δτ for light to travel distance 2D in the astronaut’s frame. (b) A NASA scientist on Earth sees the light follow the longer path 2s and take a longer time Δt. (c) These triangles are used to find the relationship between the two distances D and s.


Time Dilation
Time dilation is the lengthening of the time interval between two events for an observer in an inertial frame that is moving with respect to the rest frame of the events (in which the events occur at the same location).


To quantitatively compare the time measurements in the two inertial frames, we can relate the distances in Figure 5.4 to each other, then express each distance in terms of the time of travel (respectively either Δt or Δτ) of the pulse in the corresponding reference frame. The resulting equation can then be solved for Δt in terms of Δτ.
The lengths D and L in Figure 5.4 are the sides of a right triangle with hypotenuse s. From the Pythagorean theorem,

$$s^2 = D^2 + L^2.$$

The lengths 2s and 2L are, respectively, the distances that the pulse of light and the spacecraft travel in time Δt in the earthbound observer’s frame. The length 2D is the distance that the light pulse travels in time Δτ in the astronaut’s frame. This gives us three equations:

$$2s = c\,\Delta t; \quad 2L = v\,\Delta t; \quad 2D = c\,\Delta\tau.$$





Note that we used Einstein’s second postulate by taking the speed of light to be c in both inertial frames. We substitute these results into the previous expression from the Pythagorean theorem:

$$s^2 = D^2 + L^2$$

$$\left(c\,\frac{\Delta t}{2}\right)^2 = \left(c\,\frac{\Delta\tau}{2}\right)^2 + \left(v\,\frac{\Delta t}{2}\right)^2.$$

Then we rearrange to obtain

$$(c\,\Delta t)^2 - (v\,\Delta t)^2 = (c\,\Delta\tau)^2.$$

Finally, solving for Δt in terms of Δτ gives us

$$\Delta t = \frac{\Delta\tau}{\sqrt{1 - (v/c)^2}}. \tag{5.1}$$

This is equivalent to

$$\Delta t = \gamma\,\Delta\tau,$$

where γ is the relativistic factor (often called the Lorentz factor) given by

$$\gamma = \frac{1}{\sqrt{1 - \dfrac{v^2}{c^2}}} \tag{5.2}$$

and v and c are the speeds of the moving observer and light, respectively.
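As a numerical cross-check of the derivation, the sketch below computes γ and confirms that the light-clock distances satisfy $s^2 = D^2 + L^2$ when Δt = γΔτ. The function names, the arm length D = 1 m, and the speed 0.6c are illustrative assumptions, not values from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz_factor(v, c=C):
    """Return gamma = 1 / sqrt(1 - v^2/c^2) for a speed v with |v| < c."""
    if abs(v) >= c:
        raise ValueError("speed must be smaller than c")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def light_clock_check(v, D, c=C):
    """Return (s**2, D**2 + L**2) for a light clock of arm length D moving at v.

    Uses 2D = c*d_tau, d_t = gamma*d_tau, 2s = c*d_t and 2L = v*d_t.
    """
    d_tau = 2.0 * D / c                  # proper time for the round trip
    d_t = lorentz_factor(v, c) * d_tau   # dilated time in the earthbound frame
    s = c * d_t / 2.0                    # half of the light path seen from Earth
    L = v * d_t / 2.0                    # half of the distance the ship travels
    return s ** 2, D ** 2 + L ** 2

lhs, rhs = light_clock_check(v=0.6 * C, D=1.0)
print(f"gamma at 0.6c: {lorentz_factor(0.6 * C):.4f}")
print(f"s^2 = {lhs:.6f}, D^2 + L^2 = {rhs:.6f}")
```

The two printed quantities agree to within floating-point rounding, which is what Equation 5.1 guarantees.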
Note the asymmetry between the two measurements. Only one of them is a measurement of the time interval between two events (the emission and arrival of the light pulse) at the same position. It is a measurement of the time interval in the rest frame of a single clock. The measurement in the earthbound frame involves comparing the time interval between two events that occur at different locations. The time interval between events that occur at a single location has a separate name to distinguish it from the time measured by the earthbound observer, and we use the separate symbol Δτ to refer to it throughout this chapter.


Proper Time
The proper time interval Δτ between two events is the time interval measured by an observer for whom both events occur at the same location.


The equation relating Δt and Δτ is truly remarkable. First, as stated earlier, elapsed time is not the same for different observers moving relative to one another, even though both are in inertial frames. A proper time interval Δτ for an observer who, like the astronaut, is moving with the apparatus, is smaller than the time interval for other observers. It is the smallest possible measured time between two events. The earthbound observer sees time intervals within the moving system as dilated (i.e., lengthened) relative to how the observer moving relative to Earth sees them within the moving system. Alternatively, according to the earthbound observer, less time passes between events within the moving frame. Note that the shortest elapsed time between events is in the inertial frame in which the observer sees the events (e.g., the emission and arrival of the light signal) occur at the same point.
This time effect is real and is not caused by inaccurate clocks or improper measurements. Time-interval measurements of the same event differ for observers in relative motion. The dilation of time is an intrinsic property of time itself. All clocks moving relative to an observer, including biological clocks, such as a person’s heartbeat, or aging, are observed to run more slowly compared with a clock that is stationary relative to the observer.






Note that if the relative velocity is much less than the speed of light (v << c), then $v^2/c^2$ is extremely small, and the elapsed times Δt and Δτ are nearly equal. At low velocities, physics based on modern relativity approaches classical physics; everyday experiences involve very small relativistic effects. However, for speeds near the speed of light, $v^2/c^2$ is close to one, so $\sqrt{1 - v^2/c^2}$ is very small and Δt becomes significantly larger than Δτ.
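To see how quickly the effect grows, the following sketch tabulates γ for several values of β = v/c. For comparison it also prints the low-speed estimate 1 + β²/2, which is the standard binomial approximation; that approximation is brought in here for illustration and is not derived in the text. The sample speeds are arbitrary.

```python
import math

def gamma_from_beta(beta):
    """Exact Lorentz factor for beta = v/c with 0 <= beta < 1."""
    if not 0 <= beta < 1:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta ** 2)

def gamma_low_speed(beta):
    """Binomial (low-speed) approximation: gamma is roughly 1 + beta^2 / 2."""
    return 1.0 + 0.5 * beta ** 2

print(f"{'beta':>8} {'gamma (exact)':>16} {'1 + beta^2/2':>16}")
for beta in (1e-5, 0.01, 0.1, 0.5, 0.9, 0.99):
    print(f"{beta:>8g} {gamma_from_beta(beta):>16.10f} {gamma_low_speed(beta):>16.10f}")
```

At small β the two columns are nearly indistinguishable and both are close to 1, while near β = 1 the exact factor grows without bound and the approximation fails.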
Half-Life of a Muon
There is considerable experimental evidence that the equation Δt = γΔτ is correct. One example is found in cosmic ray particles that continuously rain down on Earth from deep space. Some collisions of these particles with nuclei in the upper atmosphere result in short-lived particles called muons. The half-life (amount of time for half of a material to decay) of a muon is 1.52 μs when it is at rest relative to the observer who measures the half-life. This is the proper time interval Δτ. This short time would allow very few muons to reach Earth’s surface and be detected if Newtonian assumptions about time and space were correct. However, muons produced by cosmic ray particles have a range of velocities, with some moving near the speed of light. It has been found that the muon’s half-life as measured by an earthbound observer (Δt) varies with velocity exactly as predicted by the equation Δt = γΔτ. The faster the muon moves, the longer it lives. We on Earth see the muon last much longer than its half-life predicts within its own rest frame. As viewed from our frame, the muon decays more slowly than it does when at rest relative to us. A far larger fraction of muons reach the ground as a result.
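A rough estimate shows how large the difference is. The sketch below compares the fraction of muons that survive a trip through the atmosphere with and without time dilation, using the 1.52 μs rest-frame half-life quoted above. The production altitude (10 km) and the muon speed (0.98c) are hypothetical values chosen only for illustration, and the model ignores energy loss and the spread of muon speeds.

```python
import math

C = 299_792_458.0          # speed of light, m/s
HALF_LIFE_REST = 1.52e-6   # muon half-life at rest, s (value quoted in the text)

def surviving_fraction(distance, beta, dilate=True):
    """Fraction of muons that survive a trip of `distance` metres at speed beta*c.

    The travel time is measured in Earth's frame. With dilate=True the half-life
    is stretched by the Lorentz factor; with dilate=False the rest-frame
    half-life is used unchanged, as Newtonian physics would assume.
    """
    if not 0 < beta < 1:
        raise ValueError("beta must lie strictly between 0 and 1")
    travel_time = distance / (beta * C)
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    half_life = gamma * HALF_LIFE_REST if dilate else HALF_LIFE_REST
    return 0.5 ** (travel_time / half_life)

# Hypothetical scenario: muons created 10 km up, moving at 0.98c.
newtonian = surviving_fraction(10_000.0, 0.98, dilate=False)
relativistic = surviving_fraction(10_000.0, 0.98, dilate=True)
print(f"Newtonian estimate:  {newtonian:.2e}")
print(f"With time dilation:  {relativistic:.2e}")
```

Under these assumed numbers the dilated estimate is several orders of magnitude larger than the Newtonian one, consistent with the observation that far more muons reach the ground than classical physics would allow.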
Before we present the first example of solving a problem in relativity, we state a strategy you can use as a guideline for these calculations.


Problem-Solving Strategy: Relativity
1. Make a list of what is given or can be inferred from the problem as stated (identify the knowns). Look in particular for information on relative velocity v.
2. Identify exactly what needs to be determined in the problem (identify the unknowns).
3. Make certain you understand the conceptual aspects of the problem before making any calculations (express the answer as an equation). Decide, for example, which observer sees time dilated or length contracted before working with the equations or using them to carry out the calculation. If you have thought about who sees what, who is moving with the event being observed, who sees proper time, and so on, you will find it much easier to determine if your calculation is reasonable.
4. Determine the primary type of calculation to be done to find the unknowns identified above (do the calculation). You will find the section summary helpful in determining whether a length contraction, relativistic kinetic energy, or some other concept is involved.


Note that you should not round off during the calculation. As noted in the text, you must often perform your calculations to many digits to see the desired effect. You may round off at the very end of the problem solution, but do not use a rounded number in a subsequent calculation. Also, check the answer to see if it is reasonable: Does it make sense? This may be more difficult for relativity, which has few everyday examples to provide experience with what is reasonable. But you can look for velocities greater than c or relativistic effects that are in the wrong direction (such as a time contraction where a dilation was expected).






Example 5.1
Time Dilation in a High-Speed Vehicle
The Hypersonic Technology Vehicle 2 (HTV-2) is an experimental rocket vehicle capable of traveling at 21,000 km/h (5830 m/s). If an electronic clock in the HTV-2 measures a time interval of exactly 1-s duration, what would observers on Earth measure the time interval to be?
Strategy
Apply the time dilation formula to relate the proper time interval of the signal in HTV-2 to the time interval measured on the ground.
Solution


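a. Identify the knowns: Δτ = 1 s; v = 5830 m/s.
b. Identify the unknown: Δt.
c. Express the answer as an equation:

$$\Delta t = \gamma\,\Delta\tau = \frac{\Delta\tau}{\sqrt{1 - \dfrac{v^2}{c^2}}}.$$

d. Do the calculation. Use the expression for γ to determine Δt from Δτ:

$$\Delta t = \frac{1\ \mathrm{s}}{\sqrt{1 - \left(\dfrac{5830\ \mathrm{m/s}}{3.00 \times 10^{8}\ \mathrm{m/s}}\right)^2}} = 1.000000000189\ \mathrm{s} = 1\ \mathrm{s} + 1.89 \times 10^{-10}\ \mathrm{s}.$$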


Significance
The very high speed of the HTV-2 is still only about $10^{-5}$ times the speed of light. Relativistic effects for the HTV-2 are negligible for almost all purposes, but are not zero.
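Because the correction in Example 5.1 appears only in the tenth decimal place, it is a good case for carrying extra digits. The sketch below repeats the calculation with Python's `decimal` module at 30 significant digits, using the same rounded value c = 3.00 × 10^8 m/s as the example. The function name is a hypothetical helper introduced for illustration.

```python
from decimal import Decimal, getcontext

getcontext().prec = 30  # carry 30 significant digits so the tiny correction is visible

def dilated_interval(proper_time, v, c):
    """Return delta_t = delta_tau / sqrt(1 - (v/c)^2) using Decimal arithmetic."""
    beta_squared = (v / c) ** 2
    return proper_time / (1 - beta_squared).sqrt()

# Values from Example 5.1: delta_tau = 1 s, v = 5830 m/s, c = 3.00e8 m/s.
delta_t = dilated_interval(Decimal(1), Decimal("5830"), Decimal("3.00e8"))
print("delta_t       =", delta_t, "s")
print("delta_t - 1 s =", delta_t - 1, "s")
```

The difference from 1 s printed on the second line should match the 1.89 × 10^-10 s found in the example once it is rounded to three significant figures.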


Example 5.2
What Speeds are Relativistic?
How fast must a vehicle travel for 1 second of time measured on a passenger’s watch in the vehicle to differ by 1% for an observer measuring it from the ground outside?
Strategy
Use the time dilation formula to find v/c for the given ratio of times.
Solution
a. Identify the known:
\[ \frac{\Delta\tau}{\Delta t} = \frac{1}{1.01}. \]


b. Identify the unknown: v/c.
c. Express the answer as an equation:


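\[
\begin{aligned}
\Delta t &= \gamma \Delta\tau = \frac{\Delta\tau}{\sqrt{1 - v^2/c^2}} \\
\frac{\Delta\tau}{\Delta t} &= \sqrt{1 - v^2/c^2} \\
\left(\frac{\Delta\tau}{\Delta t}\right)^{2} &= 1 - \frac{v^2}{c^2} \\
\frac{v}{c} &= \sqrt{1 - (\Delta\tau/\Delta t)^{2}}.
\end{aligned}
\]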


d. Do the calculation:
\[ \frac{v}{c} = \sqrt{1 - (1/1.01)^{2}} = 0.14. \]


Significance
The result shows that an object must travel at very roughly 10% of the speed of light for its motion to produce significant relativistic time dilation effects.
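To see how the required speed depends on the size of the effect, the final equation of the solution can be evaluated for several ratios. This is a small Python sketch under the same assumptions as the example; the function name and the list of percentages are illustrative choices, not part of the original text.

```python
import math


def beta_for_time_ratio(ratio):
    """Return v/c for which dt/dtau (that is, gamma) equals `ratio`, ratio >= 1."""
    return math.sqrt(1.0 - (1.0 / ratio) ** 2)


# Fractional differences between ground time and passenger time, in percent.
for percent in (0.1, 1.0, 10.0):
    ratio = 1.0 + percent / 100.0
    print(f"{percent:5.1f}% difference -> v/c = {beta_for_time_ratio(ratio):.3f}")
```

The 1% row reproduces the v/c = 0.14 obtained in step d.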


Example 5.3
Calculating Δt for a Relativistic Event
Suppose a cosmic ray colliding with a nucleus in Earth’s upper atmosphere produces a muon that has a velocity
v = 0.950c. The muon then travels at constant velocity and lives 2.20 μs as measured in the muon’s frame of
reference. (You can imagine this as the muon’s internal clock.) How long does the muon live as measured by an earthbound observer (Figure 5.5)?


Figure 5.5 A muon in Earth’s atmosphere lives longer as measured by an earthbound observer than as measured by the muon’s internal clock.


As we will discuss later, in the muon’s reference frame, it travels a shorter distance than measured in Earth’s reference frame.
Strategy
A clock moving with the muon measures the proper time of its decay process, so the time we are given is
Δτ = 2.20 μs. The earthbound observer measures Δt as given by the equation Δt = γΔτ. Because the velocity
is given, we can calculate the time in Earth’s frame of reference.






Solution
a. Identify the knowns: v = 0.950c, Δτ = 2.20 μs.
b. Identify the unknown: Δt.
c. Express the answer as an equation. Use:


\[ \Delta t = \gamma \Delta\tau \]
with
\[ \gamma = \frac{1}{\sqrt{1 - \dfrac{v^2}{c^2}}}. \]


d. Do the calculation. Use the expression for γ to determine Δt from Δτ :
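\[
\begin{aligned}
\Delta t &= \gamma \Delta\tau \\
&= \frac{1}{\sqrt{1 - v^2/c^2}}\,\Delta\tau \\
&= \frac{2.20\ \mu\text{s}}{\sqrt{1 - (0.950)^{2}}} \\
&= 7.05\ \mu\text{s}.
\end{aligned}
\]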


Remember to keep extra significant figures until the final answer.
Significance
One implication of this example is that because γ = 3.20 at 95.0% of the speed of light (v = 0.950c), the
relativistic effects are significant. The two time intervals differ by a factor of 3.20, when classically they would be the same. Something moving at 0.950c is said to be highly relativistic.
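The numbers in this example can be reproduced with a few lines of Python. This is only an illustrative sketch; the variable names are assumptions, and the speed is entered as the dimensionless ratio v/c so that no value of c is needed.

```python
import math


def gamma_from_beta(beta):
    """Return the Lorentz factor for a speed given as beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)


proper_lifetime_us = 2.20   # muon lifetime in its own frame, in microseconds
beta = 0.950                # muon speed as a fraction of c

gamma = gamma_from_beta(beta)
earth_lifetime_us = gamma * proper_lifetime_us   # dt = gamma * dtau

print(f"gamma = {gamma:.2f}")
print(f"lifetime seen from Earth = {earth_lifetime_us:.2f} microseconds")
```

Rounding is applied only when printing, in line with the reminder to keep extra significant figures until the final answer.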


Example 5.4
Relativistic Television
A non-flat screen, older-style television display (Figure 5.6) works by accelerating electrons over a short distance to relativistic speed, and then using electromagnetic fields to control where the electron beam strikes a
fluorescent layer at the front of the tube. Suppose the electrons travel at $6.00 \times 10^{7}$ m/s through a distance of
0.200 m from the start of the beam to the screen. (a) What is the time of travel of an electron in the rest frame of
the television set? (b) What is the electron’s time of travel in its own rest frame?






Figure 5.6 The electron beam in a cathode ray tube television display.


Strategy for (a)
(a) Calculate the time from vt = d. Even though the speed is relativistic, the calculation is entirely in one frame
of reference, and relativity is therefore not involved.
Solution
a. Identify the knowns:
\[ v = 6.00 \times 10^{7}\ \text{m/s}; \quad d = 0.200\ \text{m}. \]


b. Identify the unknown: the time of travel Δt.
c. Express the answer as an equation:


\[ \Delta t = \frac{d}{v}. \]


d. Do the calculation:
\[ \Delta t = \frac{0.200\ \text{m}}{6.00 \times 10^{7}\ \text{m/s}} = 3.33 \times 10^{-9}\ \text{s}. \]


Significance
The time of travel is extremely short, as expected. Because the calculation is entirely within a single frame of reference, relativity is not involved, even though the electron speed is relativistic.
Strategy for (b)
(b) In the frame of reference of the electron, the vacuum tube is moving and the electron is stationary. The electron-emitting cathode leaves the electron and the front of the vacuum tube strikes the electron with the electron at the same location. Therefore we use the time dilation formula to relate the proper time in the electron rest frame to the time in the television frame.




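Solution
a. Identify the knowns (from part a):
\[ \Delta t = 3.33 \times 10^{-9}\ \text{s}; \quad v = 6.00 \times 10^{7}\ \text{m/s}; \quad d = 0.200\ \text{m}. \]

b. Identify the unknown: Δτ.
c. Express the answer as an equation:
\[
\begin{aligned}
\Delta t &= \gamma \Delta\tau = \frac{\Delta\tau}{\sqrt{1 - v^2/c^2}} \\
\Delta\tau &= \Delta t \sqrt{1 - v^2/c^2}.
\end{aligned}
\]
d. Do the calculation:
\[ \Delta\tau = \left(3.33 \times 10^{-9}\ \text{s}\right)\sqrt{1 - \left(\frac{6.00 \times 10^{7}\ \text{m/s}}{3.00 \times 10^{8}\ \text{m/s}}\right)^{2}} = 3.26 \times 10^{-9}\ \text{s}. \]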


Significance
The time of travel is shorter in the electron frame of reference. Because the problem requires finding the time interval measured in different reference frames for the same process, relativity is involved. If we had tried to calculate the time in the electron rest frame by simply dividing the 0.200 m by the speed, the result would be slightly incorrect because of the relativistic speed of the electron.
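Both parts of this example can be sketched in Python as a check. The variable names are illustrative and c is rounded to 3.00 × 10^8 m/s as in the text. One caveat: the sketch carries the unrounded travel time from part (a) into part (b), which gives about 3.27 × 10^-9 s; the 3.26 × 10^-9 s in the worked solution comes from starting with the already rounded 3.33 × 10^-9 s.

```python
import math

C = 3.00e8   # speed of light in m/s, rounded as in the text
v = 6.00e7   # electron speed in m/s
d = 0.200    # tube length in the television frame, in m

# Part (a): a single frame, so ordinary kinematics applies.
dt_tv = d / v

# Part (b): proper time in the electron frame, dtau = dt * sqrt(1 - v^2/c^2).
dtau_electron = dt_tv * math.sqrt(1.0 - (v / C) ** 2)

print(f"television frame: {dt_tv:.3e} s")
print(f"electron frame:   {dtau_electron:.3e} s")
```

The last-digit difference is a small illustration of why rounded intermediate values should not be reused in later steps.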


5.2 Check Your Understanding What is γ if v = 0.650c?


The Twin Paradox
An intriguing consequence of time dilation is that a space traveler moving at a high velocity relative to Earth would age less than the astronaut’s earthbound twin. This is often known as the twin paradox. Imagine the astronaut moving at such a velocity that γ = 30.0, as in Figure 5.7. A trip that takes 2.00 years in her frame would take 60.0 years in the earthbound twin’s frame. Suppose the astronaut travels 1.00 year to another star system, briefly explores the area, and then travels 1.00 year back. An astronaut who was 40 years old at the start of the trip would be 42 when the spaceship returns. Everything on Earth, however, would have aged 60.0 years. The earthbound twin, if still alive, would be 100 years old.
The situation would seem different to the astronaut in Figure 5.7. Because motion is relative, the spaceship would seem to be stationary and Earth would appear to move. (This is the sensation you have when flying in a jet.) Looking out the window of the spaceship, the astronaut would see time slow down on Earth by a factor of γ = 30.0. Seen from the spaceship, the earthbound sibling will have aged only 2/30, or 0.07, of a year, whereas the astronaut would have aged 2.00 years.




Figure 5.7 The twin paradox consists of the conflicting conclusions about which twin ages more as a result of a long space journey at relativistic speed.


The paradox here is that the two twins cannot both be correct. As with all paradoxes, conflicting conclusions come from a false premise. In fact, the astronaut’s motion is significantly different from that of the earthbound twin. The astronaut accelerates to a high velocity and then decelerates to view the star system. To return to Earth, she again accelerates and decelerates. The spacecraft is not in a single inertial frame to which the time dilation formula can be directly applied. That is, the astronaut twin changes inertial reference frames. The earthbound twin does not experience these accelerations and remains in the same inertial frame. Thus, the situation is not symmetric, and it is incorrect to claim that the astronaut observes the same effects as her twin. The lack of symmetry between the twins will be still more evident when we analyze the journey later in this chapter in terms of the path the astronaut follows through four-dimensional space-time.
In 1971, American physicists Joseph Hafele and Richard Keating verified time dilation at low relative velocities by flying extremely accurate atomic clocks around the world on commercial aircraft. They measured elapsed time to an accuracy of a few nanoseconds and compared it with the time measured by clocks left behind. Hafele and Keating’s results were within experimental uncertainties of the predictions of relativity. Both special and general relativity had to be taken into account, because gravity and accelerations were involved as well as relative motion.


5.3 Check Your Understanding a. A particle travels at $1.90 \times 10^{8}$ m/s and lives $2.10 \times 10^{-8}$ s when at rest relative to an observer. How long does the particle live as viewed in the laboratory?
b. Spacecraft A and B pass in opposite directions at a relative speed of $4.00 \times 10^{7}$ m/s. An internal clock in spacecraft A causes it to emit a radio signal for 1.00 s. The computer in spacecraft B corrects for the beginning and end of the signal having traveled different distances, to calculate the time interval during which ship A was emitting the signal. What is the time interval that the computer in spacecraft B calculates?






5.4 | Length Contraction
Learning Objectives


By the end of this section, you will be able to:
• Explain how simultaneity and length contraction are related.
• Describe the relation between length contraction and time dilation and use it to derive the length-contraction equation.


The length of the train car in Figure 5.8 is the same for all the passengers. All of them would agree on the simultaneous location of the two ends of the car and obtain the same result for the distance between them. But simultaneous events in one inertial frame need not be simultaneous in another. If the train could travel at relativistic speeds, an observer on the ground would see the simultaneous locations of the two endpoints of the car at a different distance apart than observers inside the car. Measured distances need not be the same for different observers when relativistic speeds are involved.


Figure 5.8 People might describe distances differently, but at relativistic speeds, the distances really are different. (credit: “russavia”/Flickr)
Proper Length
Two observers passing each other always see the same value of their relative speed. Even though time dilation implies that the train passenger and the observer standing alongside the tracks measure different times for the train to pass, they still agree that relative speed, which is distance divided by elapsed time, is the same. If an observer on the ground and one on the train measure a different time for the length of the train to pass the ground observer, agreeing on their relative speed means they must also see different distances traveled.
The muon discussed in Example 5.3 illustrates this concept (Figure 5.9). To an observer on Earth, the muon travels at 0.950c for 7.05 μs from the time it is produced until it decays. Therefore, it travels a distance relative to Earth of:
\[ L_0 = v\Delta t = (0.950)(3.00 \times 10^{8}\ \text{m/s})(7.05 \times 10^{-6}\ \text{s}) = 2.01\ \text{km}. \]
In the muon frame, the lifetime of the muon is 2.20 μs. In this frame of reference, the Earth, air, and ground have only enough time to travel:
\[ L = v\Delta\tau = (0.950)(3.00 \times 10^{8}\ \text{m/s})(2.20 \times 10^{-6}\ \text{s}) = 0.627\ \text{km}. \]
The distance between the same two events (production and decay of a muon) depends on who measures it and how they are moving relative to it.
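The two distances can be reproduced directly from the numbers quoted above. In this illustrative Python sketch the variable names are assumptions; the ratio of the two distances is also printed, since it should come out close to the γ = 3.20 found in Example 5.3 (the small difference comes from using the rounded lifetime 7.05 μs).

```python
C = 3.00e8           # speed of light in m/s, rounded as in the text
beta = 0.950         # muon speed as a fraction of c
dt_earth = 7.05e-6   # muon lifetime measured from Earth, in s
dtau_muon = 2.20e-6  # muon lifetime in its own frame, in s

v = beta * C
L0 = v * dt_earth    # distance in Earth's frame, in m
L = v * dtau_muon    # distance Earth moves in the muon's frame, in m

print(f"L0 = {L0 / 1000:.2f} km, L = {L / 1000:.3f} km, L0/L = {L0 / L:.2f}")
```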






Proper Length
Proper length L0 is the distance between two points measured by an observer who is at rest relative to both of the
points.


The earthbound observer measures the proper length L0 because the points at which the muon is produced and decays are
stationary relative to Earth. To the muon, Earth, air, and clouds are moving, so the distance L it sees is not the proper length.


Figure 5.9 (a) The earthbound observer sees the muon travel 2.01 km. (b) The same path has length 0.627 km seen from the muon’s frame of reference. The Earth, air, and clouds are moving relative to the muon in its frame, and have smaller lengths along the direction of travel.
Length Contraction
To relate distances measured by different observers, note that the velocity relative to the earthbound observer in our muon example is given by
\[ v = \frac{L_0}{\Delta t}. \]
The time relative to the earthbound observer is Δt, because the object being timed is moving relative to this observer. The velocity relative to the moving observer is given by
\[ v = \frac{L}{\Delta\tau}. \]
The moving observer travels with the muon and therefore observes the proper time Δτ. The two velocities are identical; thus,
\[ \frac{L_0}{\Delta t} = \frac{L}{\Delta\tau}. \]
We know that Δt = γΔτ. Substituting this equation into the relationship above gives
\[ L = \frac{L_0}{\gamma}. \tag{5.3} \]


Substituting for γ gives an equation relating the distances measured by different observers.


Length Contraction
Length contraction is the decrease in the measured length of an object from its proper length when measured in a reference frame that is moving with respect to the object:
\[ L = L_0 \sqrt{1 - \frac{v^2}{c^2}} \tag{5.4} \]
where $L_0$ is the length of the object in its rest frame, and L is the length in the frame moving with velocity v.


If we measure the length of anything moving relative to our frame, we find its length L to be smaller than the proper length $L_0$ that would be measured if the object were stationary. For example, in the muon’s rest frame, the distance Earth moves between where the muon was produced and where it decayed is shorter than the distance traveled as seen from the Earth’s frame. Those points are fixed relative to Earth but are moving relative to the muon. Clouds and other objects are also contracted along the direction of motion as seen from the muon’s rest frame.
Thus, two observers measure different distances along their direction of relative motion, depending on which one is measuring distances between objects at rest.
But what about distances measured in a direction perpendicular to the relative motion? Imagine two observers moving along their x-axes and passing each other while holding meter sticks vertically in the y-direction. Figure 5.10 shows two meter sticks M and M′ that are at rest in the reference frames of two boys S and S′, respectively. A small paintbrush is attached to the top (the 100-cm mark) of stick M′. Suppose that S′ is moving to the right at a very high speed v relative to S, and the sticks are oriented so that they are perpendicular, or transverse, to their relative velocity vector. The sticks are held so that as they pass each other, their lower ends (the 0-cm marks) coincide. Assume that when S looks at his stick M afterwards, he finds a line painted on it, just below the top of the stick. Because the brush is attached to the top of the other boy’s stick M′, S can only conclude that stick M′ is less than 1.0 m long.


Figure 5.10 Meter sticks M and M′ are stationary in the reference
frames of observers S and S′, respectively. As the sticks pass, a
small brush attached to the 100-cm mark of M′ paints a line on M.


Now when the boys approach each other, S′, like S, sees a meter stick moving toward him with speed v. Because their
situations are symmetric, each boy must make the same measurement of the stick in the other frame. So, if S measures stick
M′ to be less than 1.0 m long, S′ must measure stick M to be also less than 1.0 m long, and S′ must see his paintbrush
pass over the top of stick M and not paint a line on it. In other words, after the same event, one boy sees a painted line on a stick, while the other does not see such a line on that same stick!
Einstein’s first postulate requires that the laws of physics (as, for example, applied to painting) predict that S and S′, who are both in inertial frames, make the same observations; that is, S and S′ must either both see a line painted on stick M, or both not see that line. We are therefore forced to conclude our original assumption that S saw a line painted below the top of his stick was wrong! Instead, S finds the line painted right at the 100-cm mark on M. Then both boys will agree that a line is painted on M, and they will also agree that both sticks are exactly 1 m long. We conclude then that measurements of a transverse length must be the same in different inertial frames.
Example 5.5


Calculating Length Contraction
Suppose an astronaut, such as the twin in the twin paradox discussion, travels so fast that γ = 30.00. (a) The
astronaut travels from Earth to the nearest star system, Alpha Centauri, 4.300 light years (ly) away as measured by an earthbound observer. How far apart are Earth and Alpha Centauri as measured by the astronaut? (b) In terms of c, what is the astronaut’s velocity relative to Earth? You may neglect the motion of Earth relative to the sun (Figure 5.11).






Figure 5.11 (a) The earthbound observer measures the proper distance between Earth and Alpha Centauri. (b) The astronaut observes a length contraction because Earth and Alpha Centauri move relative to her ship. She can travel this shorter distance in a smaller time (her proper time) without exceeding the speed of light.


Strategy
First, note that a light year (ly) is a convenient unit of distance on an astronomical scale: it is the distance light travels in a year. For part (a), the 4.300-ly distance between Alpha Centauri and Earth is the proper distance $L_0$, because it is measured by an earthbound observer to whom both Earth and Alpha Centauri are (approximately) stationary. To the astronaut, Earth and Alpha Centauri are moving past at the same velocity, so the distance between them is the contracted length L. In part (b), we are given γ, so we can find v by rearranging the definition of γ to express v in terms of c.
Solution for (a)
For part (a):


a. Identify the knowns: L0 = 4.300 ly; γ = 30.00.
b. Identify the unknown: L.
c. Express the answer as an equation: $L = L_0/\gamma$.
d. Do the calculation:


\[ L = \frac{L_0}{\gamma} = \frac{4.300\ \text{ly}}{30.00} = 0.1433\ \text{ly}. \]






Solution for (b)
For part (b):


a. Identify the known: γ = 30.00.
b. Identify the unknown: v in terms of c.
c. Express the answer as an equation. Start with:


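\[ \gamma = \frac{1}{\sqrt{1 - \dfrac{v^2}{c^2}}}. \]
Then solve for the unknown v/c by first squaring both sides and then rearranging:
\[
\begin{aligned}
\gamma^{2} &= \frac{1}{1 - \dfrac{v^2}{c^2}} \\
\frac{v^2}{c^2} &= 1 - \frac{1}{\gamma^{2}} \\
\frac{v}{c} &= \sqrt{1 - \frac{1}{\gamma^{2}}}.
\end{aligned}
\]
d. Do the calculation:
\[ \frac{v}{c} = \sqrt{1 - \frac{1}{\gamma^{2}}} = \sqrt{1 - \frac{1}{(30.00)^{2}}} = 0.99944 \]
or
\[ v = 0.9994\,c. \]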


Significance
Remember not to round off calculations until the final answer, or you could get erroneous results. This is especially true for special relativity calculations, where the differences might only be revealed after several decimal places. The relativistic effect is large here (γ = 30.00), and we see that v is approaching (not equaling) the speed of light. Because the distance as measured by the astronaut is so much smaller, the astronaut can travel it in much less time in her frame.
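As a numerical check of both parts, and of the closing remark about travel time, here is a short Python sketch. The variable names are illustrative. Working in light years and years means a distance in ly divided by v/c is directly a time in years; the two travel times are an added illustration that follows from the example's numbers rather than something stated in the original solution.

```python
import math

gamma = 30.00
L0_ly = 4.300                    # Earth-Alpha Centauri distance in Earth's frame

# Part (a): contracted distance seen by the astronaut, L = L0 / gamma.
L_ly = L0_ly / gamma

# Part (b): speed as a fraction of c, v/c = sqrt(1 - 1/gamma^2).
beta = math.sqrt(1.0 - 1.0 / gamma ** 2)

# One-way travel times in years (distance in ly divided by v/c).
earth_time_yr = L0_ly / beta     # measured by the earthbound observer
ship_time_yr = L_ly / beta       # measured by the astronaut (proper time)

print(f"L = {L_ly:.4f} ly, v/c = {beta:.5f}")
print(f"Earth time = {earth_time_yr:.3f} yr, ship time = {ship_time_yr:.4f} yr")
```

The ratio of the two travel times equals γ, which ties this example back to time dilation.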


People traveling at extremely high velocities could cover very large distances (thousands or even millions of light years) and age only a few years on the way. However, like emigrants in past centuries who left their home, these people would leave the Earth they know forever. Even if they returned, thousands to millions of years would have passed on Earth, obliterating most of what now exists. There is also a more serious practical obstacle to traveling at such velocities; immensely greater energies would be needed to achieve such high velocities than classical physics predicts. This will be discussed later in the chapter.
Why don’t we notice length contraction in everyday life? The distance to the grocery store does not seem to depend on whether we are moving or not. Examining the equation $L = L_0\sqrt{1 - v^2/c^2}$, we see that at low velocities ($v \ll c$), the lengths are nearly equal, which is the classical expectation. But length contraction is real, if not commonly experienced. For example, a charged particle such as an electron traveling at relativistic velocity has electric field lines that are compressed along the direction of motion as seen by a stationary observer (Figure 5.12). As the electron passes a detector, such as a coil of wire, its field interacts much more briefly, an effect observed at particle accelerators such as the 3-km-long Stanford Linear Accelerator (SLAC). In fact, to an electron traveling down the beam pipe at SLAC, the accelerator and Earth are all moving by and are length contracted. The relativistic effect is so great that the accelerator is only 0.5 m long to the electron. It is actually easier to get the electron beam down the pipe, because the beam does not have to be as precisely aimed to get down a short pipe as it would to get down a pipe 3 km long. This, again, is an experimental verification of the special theory of relativity.
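The contrast between everyday speeds and the SLAC case can be made quantitative with a rough Python sketch. The 30 m/s car speed is an assumed illustrative value that does not appear in the text, and the SLAC figures are the rounded 3 km and 0.5 m quoted above, so the implied γ is only an order-of-magnitude estimate. The function name is also illustrative.

```python
import math

C = 3.00e8  # speed of light in m/s, rounded as in the text


def contraction_fraction(beta):
    """Return (L0 - L)/L0 = 1 - sqrt(1 - beta^2), rearranged to stay
    accurate when beta is tiny."""
    beta2 = beta * beta
    return beta2 / (1.0 + math.sqrt(1.0 - beta2))


# Everyday case: a car at an assumed 30 m/s.
car_speed = 30.0
everyday = contraction_fraction(car_speed / C)

# SLAC case: gamma implied by L = L0 / gamma with L0 = 3 km and L = 0.5 m.
slac_gamma = 3000.0 / 0.5
# For gamma >> 1, 1 - v/c is approximately 1 / (2 gamma^2).
slac_one_minus_beta = 1.0 / (2.0 * slac_gamma ** 2)

print(f"car: lengths shrink by a fraction of about {everyday:.1e}")
print(f"SLAC: gamma is about {slac_gamma:.0f}, 1 - v/c is about {slac_one_minus_beta:.1e}")
```

A fractional change of a few parts in 10^15 is far below anything noticeable on a trip to the grocery store, while the SLAC electron is within about a part in 10^8 of the speed of light.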


Figure 5.12 The electric field lines of a high-velocity charged particle are compressed along the direction of motion by length contraction, producing an observably different signal as the particle goes through a coil.
5.4 Check Your Understanding A particle is traveling through Earth’s atmosphere at a speed of 0.750c. To an earthbound observer, the distance it travels is 2.50 km. How far does the particle travel as viewed from the particle’s reference frame?


5.5 | The Lorentz Transformation
Learning Objectives


• Describe the Galilean transformation of classical mechanics, relating the position, time, velocities, and accelerations measured in different inertial frames
• Derive the corresponding Lorentz transformation equations, which, in contrast to the Galilean transformation, are consistent with special relativity
• Explain the Lorentz transformation and many of the features of relativity in terms of four-dimensional space-time


We have used the postulates of relativity to examine, in particular examples, how observers in different frames of reference measure different values for lengths and the time intervals. We can gain further insight into how the postulates of relativity change the Newtonian view of time and space by examining the transformation equations that give the space and time coordinates of events in one inertial reference frame in terms of those in another. We first examine how position and time coordinates transform between inertial frames according to the view in Newtonian physics. Then we examine how this has to be changed to agree with the postulates of relativity. Finally, we examine the resulting Lorentz transformation equations and some of their consequences in terms of four-dimensional space-time diagrams, to support the view that the consequences of special relativity result from the properties of time and space itself, rather than electromagnetism.
The Galilean Transformation Equations
An event is specified by its location and time (x, y, z, t) relative to one particular inertial frame of reference S. As an example, (x, y, z, t) could denote the position of a particle at time t, and we could be looking at these positions for many different times to follow the motion of the particle. Suppose a second frame of reference S′ moves with velocity v with respect to the first. For simplicity, assume this relative velocity is along the x-axis. The relation between the time and coordinates in the two frames of reference is then


x = x′ + vt, y = y′, z = z′.


Implicit in these equations is the assumption that time measurements made by observers in both S and S′ are the same.
That is,


t = t′.






These four equations are known collectively as the Galilean transformation.
We can obtain the Galilean velocity and acceleration transformation equations by differentiating these equations with respect to time. We use u for the velocity of a particle throughout this chapter to distinguish it from v, the relative velocity of two reference frames. Note that, for the Galilean transformation, the increment of time used in differentiating to calculate the particle velocity is the same in both frames, dt = dt′. Differentiation yields
\[ u_x = u'_x + v, \quad u_y = u'_y, \quad u_z = u'_z \]
and
\[ a_x = a'_x, \quad a_y = a'_y, \quad a_z = a'_z. \]
Velocities in each frame differ by the velocity that one frame has as seen from the other frame. Observers in both frames of reference measure the same value of the acceleration. Because the mass is unchanged by the transformation, and distances between points are unchanged, observers in both frames see the same forces F = ma acting between objects and the same form of Newton’s second and third laws in all inertial frames. The laws of mechanics are consistent with the first postulate of relativity.
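The Galilean rules are simple enough to state as code. The sketch below is illustrative only: the function names are assumptions, and motion is restricted to the x-axis. It transforms two events on the path of a light pulse into S′ and recomputes the pulse speed there, which comes out as c − v rather than c.

```python
C = 3.00e8  # speed of light in m/s, rounded as in the text


def galilean_to_primed(x, t, v):
    """Express an event (x, t) of frame S in frame S', using x = x' + v t and t = t'."""
    return x - v * t, t


def galilean_velocity_to_primed(ux, v):
    """Return u'_x from the Galilean velocity rule u_x = u'_x + v."""
    return ux - v


v = 0.5 * C  # assumed relative speed of S' with respect to S

# Two events on the path of a light pulse moving along +x in S.
x1p, t1p = galilean_to_primed(0.0, 0.0, v)
x2p, t2p = galilean_to_primed(C * 2.0, 2.0, v)

speed_in_primed = (x2p - x1p) / (t2p - t1p)
print(f"pulse speed in S' under the Galilean rules: {speed_in_primed / C:.2f} c")
```

A light pulse that changes speed between inertial frames is exactly what Einstein's second postulate rules out.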
The Lorentz Transformation Equations
The Galilean transformation nevertheless violates Einstein’s postulates, because the velocity equations state that a pulse of light moving with speed c along the x-axis would travel at speed c − v in the other inertial frame. Specifically, the spherical pulse has radius r = ct at time t in the unprimed frame, and also has radius r′ = ct′ at time t′ in the primed frame. Expressing these relations in Cartesian coordinates gives
\[ x^2 + y^2 + z^2 - c^2 t^2 = 0 \]
\[ x'^2 + y'^2 + z'^2 - c^2 t'^2 = 0. \]
The left-hand sides of the two expressions can be set equal because both are zero. Because y = y′ and z = z′, we obtain
\[ x^2 - c^2 t^2 = x'^2 - c^2 t'^2. \tag{5.5} \]
This cannot be satisfied for nonzero relative velocity v of the two frames if we assume the Galilean transformation results in t = t′ with x = x′ + vt′.
To find the correct set of transformation equations, assume the two coordinate systems S and S′ in Figure 5.13. First
suppose that an event occurs at (x′, 0, 0, t′) in S′ and at (x, 0, 0, t) in S, as depicted in the figure.


Figure 5.13 An event occurs at (x, 0, 0, t) in S and at
(x′, 0, 0, t′) in S′. The Lorentz transformation equations relate
events in the two systems.


Suppose that at the instant that the origins of the coordinate systems in S and S′ coincide, a flash bulb emits a spherically spreading pulse of light starting from the origin. At time t, an observer in S finds the origin of S′ to be at x = vt. With the help of a friend in S, the S′ observer also measures the distance from the event to the origin of S′ and finds it to be $x'\sqrt{1 - v^2/c^2}$. This follows because we have already shown the postulates of relativity to imply length contraction. Thus the position of the event in S is
\[ x = vt + x'\sqrt{1 - v^2/c^2} \]
and
\[ x' = \frac{x - vt}{\sqrt{1 - v^2/c^2}}. \]


The postulates of relativity imply that the equation relating distance and time of the spherical wave front:
\[ x^2 + y^2 + z^2 - c^2 t^2 = 0 \]
must apply both in terms of primed and unprimed coordinates, which was shown above to lead to Equation 5.5:
\[ x^2 - c^2 t^2 = x'^2 - c^2 t'^2. \]
We combine this with the equation relating x and x′ to obtain the relation between t and t′:
\[ t' = \frac{t - vx/c^2}{\sqrt{1 - v^2/c^2}}. \]


The equations relating the time and position of the events as seen in S are then
\[
\begin{aligned}
t &= \frac{t' + vx'/c^2}{\sqrt{1 - v^2/c^2}} \\
x &= \frac{x' + vt'}{\sqrt{1 - v^2/c^2}} \\
y &= y' \\
z &= z'.
\end{aligned}
\]

This set of equations, relating the position and time in the two inertial frames, is known as the Lorentz transformation. They are named in honor of H.A. Lorentz (1853–1928), who first proposed them. Interestingly, he justified the transformation on what was eventually discovered to be a fallacious hypothesis. The correct theoretical basis is Einstein’s special theory of relativity.
The reverse transformation expresses the variables in S′ in terms of those in S. Simply interchanging the primed and unprimed variables and substituting −v for v gives:
\[
\begin{aligned}
t' &= \frac{t - vx/c^2}{\sqrt{1 - v^2/c^2}} \\
x' &= \frac{x - vt}{\sqrt{1 - v^2/c^2}} \\
y' &= y \\
z' &= z.
\end{aligned}
\]
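The two sets of equations can be checked against each other numerically. The following Python sketch is illustrative: the function names, the sample event, and the relative speed 0.6c are assumptions chosen for convenience, and only x and t are transformed because y and z are unchanged. It confirms that transforming to S′ and back recovers the original event, and that the combination x² − c²t² of Equation 5.5 has the same value in both frames.

```python
import math

C = 3.00e8  # speed of light in m/s, rounded as in the text


def gamma(v):
    """Return 1 / sqrt(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def to_primed(x, t, v):
    """Coordinates (x', t') in S' of an event (x, t) in S."""
    g = gamma(v)
    return g * (x - v * t), g * (t - v * x / C ** 2)


def to_unprimed(xp, tp, v):
    """Coordinates (x, t) in S of an event (x', t') in S'."""
    g = gamma(v)
    return g * (xp + v * tp), g * (tp + v * xp / C ** 2)


v = 0.6 * C            # assumed relative speed
x, t = 1.0e9, 5.0      # an arbitrary sample event in S (m, s)

xp, tp = to_primed(x, t, v)
x_back, t_back = to_unprimed(xp, tp, v)

invariant_s = x ** 2 - (C * t) ** 2
invariant_sp = xp ** 2 - (C * tp) ** 2

print(f"S' coordinates: x' = {xp:.4e} m, t' = {tp:.4f} s")
print(f"round trip:     x  = {x_back:.4e} m, t  = {t_back:.4f} s")
print(f"x^2 - c^2 t^2:  {invariant_s:.4e} in S, {invariant_sp:.4e} in S'")
```

For this sample event γ = 1.25, so the primed coordinates are easy to verify by hand: x′ = 1.25 × 10^8 m and t′ = 3.75 s.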


Example 5.6
Using the Lorentz Transformation for Time
Spacecraft S′ is on its way to Alpha Centauri when Spacecraft S passes it at relative speed c/2. The captain of
S′ sends a radio signal that lasts 1.2 s according to that ship’s clock. Use the Lorentz transformation to find the
time interval of the signal measured by the communications officer of spaceship S.
Solution
a. Identify the known: $\Delta t' = t_2' - t_1' = 1.2\ \text{s}$; $\Delta x' = x_2' - x_1' = 0$.


b. Identify the unknown: Δt = t2 − t1.






c. Express the answer as an equation. The time signal starts at $(x', t_1')$ and stops at $(x', t_2')$. Note that the x′ coordinate of both events is the same because the clock is at rest in S′. Write the first Lorentz transformation equation in terms of $\Delta t = t_2 - t_1$, $\Delta x = x_2 - x_1$, and similarly for the primed coordinates, as:
\[ \Delta t = \frac{\Delta t' + v\Delta x'/c^2}{\sqrt{1 - \dfrac{v^2}{c^2}}}. \]
Because the position of the clock in S′ is fixed, Δx′ = 0, and the time interval Δt becomes:
\[ \Delta t = \frac{\Delta t'}{\sqrt{1 - \dfrac{v^2}{c^2}}}. \]
d. Do the calculation. With Δt′ = 1.2 s this gives:
\[ \Delta t = \frac{1.2\ \text{s}}{\sqrt{1 - \left(\dfrac{1}{2}\right)^{2}}} = 1.4\ \text{s}. \]


Note that the Lorentz transformation reproduces the time dilation equation.
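The arithmetic in step d is easy to verify. This minimal Python sketch evaluates the interval form of the first Lorentz transformation equation; the function name and argument order are illustrative, and the speed is passed as the ratio v/c.

```python
import math


def delta_t_unprimed(dt_primed, dx_primed, beta, c=3.00e8):
    """Return dt = (dt' + v dx'/c^2) / sqrt(1 - v^2/c^2), with v = beta * c."""
    v = beta * c
    return (dt_primed + v * dx_primed / c ** 2) / math.sqrt(1.0 - beta ** 2)


# Signal lasting 1.2 s on a clock at rest in S' (dx' = 0), relative speed c/2.
dt = delta_t_unprimed(1.2, 0.0, 0.5)
print(f"dt = {dt:.2f} s")
```

With Δx′ = 0 the result is 1.2 s divided by the square root of 0.75, about 1.39 s, which rounds to 1.4 s at two significant figures.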


Example 5.7
Using the Lorentz Transformation for Length
A surveyor measures a street to be L = 100 m long in Earth frame S. Use the Lorentz transformation to obtain an
expression for its length measured from a spaceship S′, moving by at speed 0.20c, assuming the x coordinates
of the two frames coincide at time t = 0.
Solution
a. Identify the known: L = 100 m; v = 0.20c; Δt′ = 0.

b. Identify the unknown: L′.
c. Express the answer as an equation. The surveyor in frame S has found the two ends of the street at rest at $x_2$ and $x_1$, a distance $L = x_2 - x_1 = 100\ \text{m}$ apart. The spaceship crew measures the locations of the two ends simultaneously in their own frame, at the same time t′, and finds them a distance $L' = x_2' - x_1'$ apart. To relate the lengths recorded by observers in S′ and S, respectively, write the second of the four Lorentz transformation equations for each end and subtract:
\[ x_2 - x_1 = \frac{x_2' + vt'}{\sqrt{1 - v^2/c^2}} - \frac{x_1' + vt'}{\sqrt{1 - v^2/c^2}} = \frac{x_2' - x_1'}{\sqrt{1 - v^2/c^2}} = \frac{L'}{\sqrt{1 - v^2/c^2}}. \]
d. Do the calculation. Because $x_2 - x_1 = 100\ \text{m}$, the length of the street measured from the spaceship is equal to:
\[ L' = (100\ \text{m})\sqrt{1 - v^2/c^2} = (100\ \text{m})\sqrt{1 - (0.20)^{2}} = 98.0\ \text{m}. \]


Note that the Lorentz transformation gave the length contraction equation for the street.
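The same result can be obtained numerically by locating each end of the street in the spaceship frame at a single ship time. The Python sketch below is illustrative: the function name is an assumption, the ship time is taken to be t′ = 0, and the street ends are placed at x = 0 and x = 100 m. For each end it finds the Earth-frame time at which t′ = 0 and then applies the Lorentz transformation for x′.

```python
import math

C = 3.00e8  # speed of light in m/s, rounded as in the text


def street_end_in_ship_frame(x, v):
    """Return x' of a point at rest at position x in S, evaluated at ship time t' = 0."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    t = v * x / C ** 2          # S time at which t' = g (t - v x / c^2) is zero
    return g * (x - v * t)      # x' = g (x - v t)


v = 0.20 * C
x1, x2 = 0.0, 100.0             # ends of the street in Earth frame S, in m

L_ship = street_end_in_ship_frame(x2, v) - street_end_in_ship_frame(x1, v)
print(f"street length measured from the ship: {L_ship:.1f} m")
```

Note that the two ends are located at different Earth-frame times, which is the relativity of simultaneity at work inside the length measurement.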






Example 5.8
Lorentz Transformation and Simultaneity
The observer shown in Figure 5.14 standing by the railroad tracks sees the two bulbs flash simultaneously at both ends of the 26-m-long passenger car when the middle of the car passes him at a speed of c/2. Find the separation in time between when the bulbs flashed as seen by the train passenger seated in the middle of the car.


Figure 5.14 A person watching a train go by observes two bulbs flash simultaneously at opposite ends of a passenger car. There is another passenger inside of the car observing the same flashes but from a different perspective.


Solution
a. Identify the known: Δt = 0.
Note that the spatial separation of the two events is between the two lamps, not the distance of the lamp to the passenger.

b. Identify the unknown: $\Delta t' = t_2' - t_1'$.
Again, note that the time interval is between the flashes of the lamps, not between arrival times for reaching the passenger.

c. Express the answer as an equation:
\[ \Delta t = \frac{\Delta t' + v\Delta x'/c^2}{\sqrt{1 - v^2/c^2}}. \]

d. Do the calculation:
\[
\begin{aligned}
0 &= \frac{\Delta t' + \dfrac{c}{2}(26\ \text{m})/c^2}{\sqrt{1 - v^2/c^2}} \\
\Delta t' &= -\frac{26\ \text{m}}{2c} = -\frac{26\ \text{m}}{2\left(3.00 \times 10^{8}\ \text{m/s}\right)} \\
\Delta t' &= -4.33 \times 10^{-8}\ \text{s}.
\end{aligned}
\]


Significance
The sign indicates that the event with the larger x′, namely, the flash from the right at $x_2'$, is seen to occur first in the
S′ frame, as found earlier for this example, so that $t_2' < t_1'$.
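The time separation in the train frame follows from setting Δt = 0 in the transformation equation and solving for Δt′. A minimal Python sketch, with an illustrative function name and c rounded to 3.00 × 10^8 m/s:

```python
C = 3.00e8  # speed of light in m/s, rounded as in the text


def delta_t_primed_for_simultaneous(dx_primed, v, c=C):
    """Solve 0 = dt' + v dx'/c^2 for dt' (events simultaneous in S)."""
    return -v * dx_primed / c ** 2


# Car length 26 m in the train frame, relative speed c/2.
dt_primed = delta_t_primed_for_simultaneous(26.0, 0.5 * C)
print(f"dt' = {dt_primed:.3e} s")
```

The negative sign carries the physical content: in the train frame the flash at the larger x′ happens earlier.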


Space-time
Relativistic phenomena can be analyzed in terms of events in a four-dimensional space-time. When phenomena such as the twin paradox, time dilation, length contraction, and the dependence of simultaneity on relative motion are viewed in this way, they are seen to be characteristic of the nature of space and time, rather than specific aspects of electromagnetism.
In three-dimensional space, positions are specified by three coordinates on a set of Cartesian axes, and the displacement of one point from another is given by:


\[ (\Delta x, \Delta y, \Delta z) = (x_2 - x_1,\ y_2 - y_1,\ z_2 - z_1). \]


The distance Δr between the points is
$$\Delta r^2 = (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2.$$


The distance Δr is invariant under a rotation of axes. If a new set of Cartesian axes, rotated around the origin relative to the original axes, is used, each point in space will have new coordinates in terms of the new axes, but the distance Δr′ given by

$$\Delta r'^2 = (\Delta x')^2 + (\Delta y')^2 + (\Delta z')^2$$

has the same value that Δr had. Something similar happens with the Lorentz transformation in space-time.
Define the separation between two events, each given by a set of x, y, z, and ct along a four-dimensional Cartesian system of axes in space-time, as

$$(\Delta x, \Delta y, \Delta z, c\Delta t) = \left(x_2 - x_1,\ y_2 - y_1,\ z_2 - z_1,\ c(t_2 - t_1)\right).$$


Also define the space-time interval Δs between the two events as
$$\Delta s^2 = (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 - (c\Delta t)^2.$$


If the two events have the same value of ct in the frame of reference considered, Δs would correspond to the distance Δr between points in space.
The path of a particle through space-time consists of the events (x, y, z, ct) specifying a location at each time of its motion. The path through space-time is called the world line of the particle. The world line of a particle that remains at rest at the same location is a straight line that is parallel to the time axis. If the particle moves at constant velocity parallel to the x-axis, its world line would be a sloped line x = vt, corresponding to a simple displacement vs. time graph. If the particle accelerates, its world line is curved. The increment of s along the world line of the particle is given in differential form as

$$ds^2 = (dx)^2 + (dy)^2 + (dz)^2 - c^2(dt)^2.$$


Just as the distance Δr is invariant under rotation of the space axes, the space-time interval

$$\Delta s^2 = (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 - (c\Delta t)^2$$

is invariant under the Lorentz transformation. This follows from the postulates of relativity, and can also be seen by substitution of the previous Lorentz transformation equations into the expression for the space-time interval:

$$\begin{aligned}
\Delta s^2 &= (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 - (c\Delta t)^2 \\
&= \left(\frac{\Delta x' + v\Delta t'}{\sqrt{1 - v^2/c^2}}\right)^2 + (\Delta y')^2 + (\Delta z')^2 - \left(c\,\frac{\Delta t' + v\Delta x'/c^2}{\sqrt{1 - v^2/c^2}}\right)^2 \\
&= (\Delta x')^2 + (\Delta y')^2 + (\Delta z')^2 - (c\Delta t')^2 \\
&= \Delta s'^2.
\end{aligned}$$
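The invariance can also be spot-checked numerically. The sketch below boosts an arbitrary separation along x and compares the interval before and after. The function names and sample numbers are illustrative assumptions, and the time separation is carried as cΔt so that all quantities are lengths.

```python
import math


def boost(dx, c_dt, beta):
    """Transform (dx, c*dt) from S to a frame S' moving at beta = v/c along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    dx_prime = gamma * (dx - beta * c_dt)
    c_dt_prime = gamma * (c_dt - beta * dx)
    return dx_prime, c_dt_prime


def interval_squared(dx, dy, dz, c_dt):
    """Space-time interval: ds^2 = dx^2 + dy^2 + dz^2 - (c dt)^2."""
    return dx**2 + dy**2 + dz**2 - c_dt**2


# Arbitrary illustrative separation, all components in meters.
dx, dy, dz, c_dt = 3.0, 1.0, 2.0, 5.0
s2_unprimed = interval_squared(dx, dy, dz, c_dt)

for beta in (0.1, 0.5, 0.9):
    dx_p, c_dt_p = boost(dx, c_dt, beta)
    s2_primed = interval_squared(dx_p, dy, dz, c_dt_p)
    print(f"beta = {beta}: ds^2 = {s2_unprimed:.6f}, ds'^2 = {s2_primed:.6f}")
```

The individual components Δx′ and cΔt′ change with β, but the two printed interval values agree to within rounding for every boost.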


In addition, the Lorentz transformation changes the coordinates of an event in time and space similarly to how a three-dimensional rotation changes old coordinates into new coordinates:

Lorentz transformation (x, t coordinates):    Axis rotation around z-axis (x, y coordinates):
x′ = (γ)x + (−βγ)ct                           x′ = (cos θ)x + (sin θ)y
ct′ = (−βγ)x + (γ)ct                          y′ = (−sin θ)x + (cos θ)y

where

$$\gamma = \frac{1}{\sqrt{1 - \beta^2}}; \quad \beta = v/c.$$


Lorentz transformations can be regarded as generalizations of spatial rotations to space-time. However, there are some differences between a three-dimensional axis rotation and a Lorentz transformation involving the time axis, because of differences in the metric, or rule for measuring the displacements Δr and Δs. Although Δr is invariant under spatial rotations and Δs is invariant also under Lorentz transformation, the Lorentz transformation involving the time axis does not preserve some features, such as the axes remaining perpendicular or the length scale along each axis remaining the same.
Note that the quantity Δs² can have either sign, depending on the coordinates of the space-time events involved. For pairs of events that give it a negative sign, it is useful to define c²Δτ² as −Δs². The significance of Δτ as just defined follows by noting that in a frame of reference where the two events occur at the same location, we have Δx = Δy = Δz = 0 and therefore (from the equation for Δs²):

$$c^2\Delta\tau^2 = -\Delta s^2 = (c\Delta t)^2.$$

Therefore Δτ is the time interval Δt in the frame of reference where both events occur at the same location. It is the same interval of proper time discussed earlier. It also follows from the relation between Δs and Δτ that because Δs is Lorentz invariant, the proper time is also Lorentz invariant. All observers in all inertial frames agree on the proper time intervals between the same two events.


5.5 Check Your Understanding Show that if a time increment dt elapses for an observer who sees the particle moving with velocity v, it corresponds to a proper time increment for the particle of dτ = dt/γ.


The light cone
We can deal with the difficulty of visualizing and sketching graphs in four dimensions by imagining the three spatial coordinates to be represented collectively by a horizontal axis, and the vertical axis to be the ct-axis. Starting with a particular event in space-time as the origin of the space-time graph shown, the world line of a particle that remains at rest at the initial location of the event at the origin then is the time axis. Any plane through the time axis parallel to the spatial axes contains all the events that are simultaneous with each other and with the intersection of the plane and the time axis, as seen in the rest frame of the event at the origin.
It is useful to picture a light cone on the graph, formed by the world lines of all light beams passing through the origin event A, as shown in Figure 5.15. The light cone has sides at an angle of 45° if the time axis is measured in units of ct, and, according to the postulates of relativity, the light cone remains the same in all inertial frames. Because the event A is arbitrary, every point in the space-time diagram has a light cone associated with it.


Figure 5.15 The light cone consists of all the world lines followed by light from the event A at the vertex of the cone.


Consider now the world line of a particle through space-time. Any world line outside of the cone, such as one passing from A through C, would involve speeds greater than c, and would therefore not be possible. Events such as C that lie outside the light cone are said to have a space-like separation from event A. They are characterized by:

$$\Delta s_{AC}^2 = (x_A - x_C)^2 + (y_A - y_C)^2 + (z_A - z_C)^2 - (c\Delta t)^2 > 0.$$


An event like B that lies in the upper cone is reachable without exceeding the speed of light in vacuum, and is characterized by

$$\Delta s_{AB}^2 = (x_A - x_B)^2 + (y_A - y_B)^2 + (z_A - z_B)^2 - (c\Delta t)^2 < 0.$$


The event is said to have a time-like separation from A. Time-like events that fall into the upper half of the light cone occur at greater values of t than the time of the event A at the vertex and are in the future relative to A. Events that have time-like separation from A and fall in the lower half of the light cone are in the past, and can affect the event at the origin. The region outside the light cone is labeled as neither past nor future, but rather as “elsewhere.”
For any event that has a space-like separation from the event at the origin, it is possible to choose a time axis that will make the two events occur at the same time, so that the two events are simultaneous in some frame of reference. Therefore, which of the events with space-like separation comes before the other in time also depends on the frame of reference of the observer. Since space-like separations can be traversed only by exceeding the speed of light, this ambiguity in which event can cause the other provides another argument for why particles cannot travel faster than the speed of light, as well as potential material for science fiction about time travel. Similarly, for any event with time-like separation from the event at the origin, a frame of reference can be found that will make the events occur at the same location. Because the relations

$$\Delta s_{AC}^2 = (x_A - x_C)^2 + (y_A - y_C)^2 + (z_A - z_C)^2 - (c\Delta t)^2 > 0$$

and

$$\Delta s_{AB}^2 = (x_A - x_B)^2 + (y_A - y_B)^2 + (z_A - z_B)^2 - (c\Delta t)^2 < 0$$

are Lorentz invariant, whether two events are time-like and can be made to occur at the same place or space-like and can be made to occur at the same time is the same for all observers. All observers in different inertial frames of reference agree on whether two events have a time-like or space-like separation.
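The sign test on Δs² is easy to express as a small function. The sketch below is illustrative only: the function name is an assumption, the time separation is supplied as cΔt in the same length unit as the spatial components, and the label "light-like" for the boundary case Δs² = 0 (events on the light cone itself) is the standard term, although the passage above does not use it.

```python
def classify_separation(dx, dy, dz, c_dt):
    """Classify the separation of two events by the sign of ds^2.

    ds^2 = dx^2 + dy^2 + dz^2 - (c dt)^2
      > 0  -> space-like (outside the light cone, "elsewhere")
      < 0  -> time-like  (inside the light cone, past or future)
      == 0 -> light-like (on the light cone)
    """
    s2 = dx**2 + dy**2 + dz**2 - c_dt**2
    if s2 > 0:
        return "space-like"
    if s2 < 0:
        return "time-like"
    return "light-like"


# Illustrative separations (all values in meters).
print(classify_separation(1.0, 0.0, 0.0, 2.0))  # time-like
print(classify_separation(2.0, 0.0, 0.0, 1.0))  # space-like
print(classify_separation(3.0, 4.0, 0.0, 5.0))  # light-like
```

Because Δs² is Lorentz invariant, the label returned for a pair of events does not depend on which inertial frame supplied the coordinates. With measured floating-point data, the exact comparison with zero would normally be replaced by a tolerance.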
The twin paradox seen in space-time
The twin paradox discussed earlier involves an astronaut twin traveling at near light speed to a distant star system, and returning to Earth. Because of time dilation, the space twin is predicted to age much less than the earthbound twin. This seems paradoxical because we might have expected at first glance for the relative motion to be symmetrical and naively thought it possible to also argue that the earthbound twin should age less.
To analyze this in terms of a space-time diagram, assume that the origin of the axes used is fixed to Earth. The world line of the earthbound twin is then along the time axis.
The world line of the astronaut twin, who travels to the distant star and then returns, must deviate from a straight line path in order to allow a return trip. As seen in Figure 5.16, the circumstances of the two twins are not at all symmetrical. Their paths in space-time are of manifestly different length. Specifically, the world line of the earthbound twin has length 2cΔt, which then gives the proper time that elapses for the earthbound twin as 2Δt. The distance to the distant star system is Δx = vΔt. The proper time that elapses for the space twin is 2Δτ where

$$c^2\Delta\tau^2 = -\Delta s^2 = (c\Delta t)^2 - (\Delta x)^2.$$


This is considerably shorter than the proper time for the earthbound twin by the ratio

$$\frac{c\Delta\tau}{c\Delta t} = \sqrt{\frac{(c\Delta t)^2 - (\Delta x)^2}{(c\Delta t)^2}} = \sqrt{\frac{(c\Delta t)^2 - (v\Delta t)^2}{(c\Delta t)^2}} = \sqrt{1 - \frac{v^2}{c^2}} = \frac{1}{\gamma},$$

consistent with the time dilation formula. The twin paradox is therefore seen to be no paradox at all. The situation of the two twins is not symmetrical in the space-time diagram. The only surprise is perhaps that the seemingly longer path on the space-time diagram corresponds to the smaller proper time interval, because of how Δτ and Δs depend on Δx and Δt.
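As a numerical illustration of this ratio, the sketch below evaluates both proper times for one assumed trip. The speed of 0.8c and the 5-year outbound leg are made-up values chosen only for easy arithmetic, and times are expressed in years with distances divided by c so that everything shares one unit.

```python
import math


def proper_time_one_leg(dt, dx_over_c):
    """Proper time along a straight leg, from c^2 dtau^2 = (c dt)^2 - dx^2.

    dt is the Earth-frame time for the leg; dx_over_c is the distance
    covered divided by c, so both arguments use the same time unit.
    """
    return math.sqrt(dt**2 - dx_over_c**2)


beta = 0.8    # assumed cruise speed as a fraction of c (illustrative)
dt_leg = 5.0  # assumed Earth-frame duration of each leg, in years

earth_total = 2 * dt_leg
space_total = 2 * proper_time_one_leg(dt_leg, beta * dt_leg)
gamma = 1 / math.sqrt(1 - beta**2)

print(f"earthbound twin: {earth_total:.2f} yr")  # 10.00 yr
print(f"space twin:      {space_total:.2f} yr")  # 6.00 yr
print(f"ratio = {space_total / earth_total:.3f}, 1/gamma = {1 / gamma:.3f}")
```

The ratio of the two proper times matches 1/γ, as the derivation requires.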






Figure 5.16 The space twin and the earthbound twin, in the twin paradox example, follow world lines of different length through space-time.
Lorentz transformations in space-time
We have already noted how the Lorentz transformation leaves

$$\Delta s^2 = (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 - (c\Delta t)^2$$

unchanged and corresponds to a rotation of axes in the four-dimensional space-time. If the S and S′ frames are in relative motion along their shared x-direction, the space and time axes of S′ are rotated by an angle α as seen from S, in the way shown in Figure 5.17, where:

$$\tan\alpha = \frac{v}{c} = \beta.$$


This differs from a rotation in the usual three-dimensional sense, insofar as the two space-time axes rotate toward each other symmetrically in a scissors-like way, as shown. The time and space axes are both rotated through the same angle. The mesh of dashed lines parallel to the two axes shows how coordinates of an event would be read along the primed axes. This would be done by following a line parallel to the x′-axis and one parallel to the t′-axis, as shown by the dashed lines. The length scales of both axes are changed by:

$$ct' = ct\sqrt{\frac{1 + \beta^2}{1 - \beta^2}}; \quad x' = x\sqrt{\frac{1 + \beta^2}{1 - \beta^2}}.$$


The line labeled “v = c” at 45° to the x-axis corresponds to the edge of the light cone, and is unaffected by the Lorentz transformation, in accordance with the second postulate of relativity. The “v = c” line, and the light cone it represents, are the same for both the S and S′ frames of reference.
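To get a feel for the size of the scissors-like rotation, the sketch below evaluates the angle α and the axis scale factor from the two formulas above for a chosen β. The function name and the sample value β = 0.5 are illustrative assumptions.

```python
import math


def diagram_geometry(beta):
    """Return (alpha in degrees, axis scale factor) for relative speed beta = v/c.

    alpha comes from tan(alpha) = beta; the scale factor is
    sqrt((1 + beta^2) / (1 - beta^2)), as in the length-scale relation.
    """
    alpha_deg = math.degrees(math.atan(beta))
    scale = math.sqrt((1 + beta**2) / (1 - beta**2))
    return alpha_deg, scale


alpha_deg, scale = diagram_geometry(0.5)
print(f"alpha = {alpha_deg:.2f} deg, scale = {scale:.2f}")  # alpha = 26.57 deg, scale = 1.29
```

As β approaches 1, α approaches 45°, so both primed axes close in on the “v = c” line without ever crossing it.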






Figure 5.17 The Lorentz transformation results in new space and time axes rotated in a scissors-like way with respect to the original axes.
Simultaneity
Simultaneity of events at separated locations depends on the frame of reference used to describe them, as given by the scissors-like “rotation” to new time and space coordinates as described. If two events have the same t values in the unprimed frame of reference, they need not have the same values measured along the ct′-axis, and would then not be simultaneous in the primed frame.
As a specific example, consider the near-light-speed train in which flash lamps at the two ends of the car have flashed simultaneously in the frame of reference of an observer on the ground. The space-time graph is shown in Figure 5.18. The flashes of the two lamps are represented by the dots labeled “Left flash lamp” and “Right flash lamp” that lie on the light cone in the past. The world lines of both pulses travel along the edge of the light cone to arrive at the observer on the ground simultaneously. Their arrival is the event at the origin. They therefore had to be emitted simultaneously in the unprimed frame, as represented by the point labeled as t(both). But time is measured along the ct′-axis in the frame of reference of the observer seated in the middle of the train car. So in her frame of reference, the emission events of the bulbs, labeled as t′(left) and t′(right), were not simultaneous.






Figure 5.18 The train example revisited. The flashes occur at the same time t(both) along the time axis of the ground observer, but at different times along the t′ time axis of the passenger.


In terms of the space-time diagram, the two observers are merely using different time axes for the same events because they are in different inertial frames, and the conclusions of both observers are equally valid. As the analysis in terms of the space-time diagrams further suggests, the way simultaneity of events depends on the frame of reference results from the properties of space and time themselves, rather than from anything specifically about electromagnetism.
5.6 | Relativistic Velocity Transformation


Learning Objectives
By the end of this section, you will be able to:
• Derive the equations consistent with special relativity for transforming velocities in one inertial frame of reference into another.
• Apply the velocity transformation equations to objects moving at relativistic speeds.
• Examine how the combined velocities predicted by the relativistic transformation equations compare with those expected classically.


Remaining in place in a kayak in a fast-moving river takes effort. The river current pulls the kayak along. Trying to paddle against the flow can move the kayak upstream relative to the water, but that only accounts for part of its velocity relative to the shore. The kayak’s motion is an example of how velocities in Newtonian mechanics combine by vector addition. The kayak’s velocity is the vector sum of its velocity relative to the water and the water’s velocity relative to the riverbank. However, the relativistic addition of velocities is quite different.
Velocity Transformations
Imagine a car traveling at night along a straight road, as in Figure 5.19. The driver sees the light leaving the headlights at speed c within the car’s frame of reference. If the Galilean transformation applied to light, then the light from the car’s headlights would approach the pedestrian at a speed u = v + c, contrary to Einstein’s postulates.


Figure 5.19 According to experimental results and the second postulate of relativity, light from the car’s headlights moves away from the car at speed c and toward the observer on the sidewalk at speed c.


Both the distance traveled and the time of travel are different in the two frames of reference, and they must differ in a way that makes the speed of light the same in all inertial frames. The correct rules for transforming velocities from one frame to another can be obtained from the Lorentz transformation equations.
Relativistic Transformation of Velocity
Suppose an object P is moving at constant velocity u′ = (ux′, uy′, uz′) as measured in the S′ frame. The S′ frame is moving along its x′-axis at velocity v. In an increment of time dt′, the particle is displaced by dx′ along the x′-axis. Applying the Lorentz transformation equations gives the corresponding increments of time and displacement in the unprimed axes:

$$\begin{aligned}
dt &= \gamma\left(dt' + v\,dx'/c^2\right) \\
dx &= \gamma\left(dx' + v\,dt'\right) \\
dy &= dy' \\
dz &= dz'.
\end{aligned}$$

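The velocity components of the particle seen in the unprimed coordinate system are then

$$\begin{aligned}
\frac{dx}{dt} &= \frac{\gamma\left(dx' + v\,dt'\right)}{\gamma\left(dt' + v\,dx'/c^2\right)} = \frac{\dfrac{dx'}{dt'} + v}{1 + \dfrac{v}{c^2}\dfrac{dx'}{dt'}} \\
\frac{dy}{dt} &= \frac{dy'}{\gamma\left(dt' + v\,dx'/c^2\right)} = \frac{\dfrac{dy'}{dt'}}{\gamma\left(1 + \dfrac{v}{c^2}\dfrac{dx'}{dt'}\right)} \\
\frac{dz}{dt} &= \frac{dz'}{\gamma\left(dt' + v\,dx'/c^2\right)} = \frac{\dfrac{dz'}{dt'}}{\gamma\left(1 + \dfrac{v}{c^2}\dfrac{dx'}{dt'}\right)}.
\end{aligned}$$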


We thus obtain the equations for the velocity components of the object as seen in frame S:

$$u_x = \frac{u_x' + v}{1 + v u_x'/c^2}, \quad u_y = \frac{u_y'/\gamma}{1 + v u_x'/c^2}, \quad u_z = \frac{u_z'/\gamma}{1 + v u_x'/c^2}.$$


Compare this with how the Galilean transformation of classical mechanics says the velocities transform, by adding simply as vectors:

$$u_x = u_x' + v, \quad u_y = u_y', \quad u_z = u_z'.$$

When the relative velocity of the frames is much smaller than the speed of light, that is, when v ≪ c, the special relativity velocity addition law reduces to the Galilean velocity law. When the speed v of S′ relative to S is comparable to the speed of light, the relativistic velocity addition law gives a much smaller result than the classical (Galilean) velocity addition does.
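The three component equations can be collected into one function. The following is a minimal sketch under stated assumptions: the function name and the sample velocities are illustrative, S′ is taken to move along the shared x-axis at speed v as in the derivation, and the defined value c = 299 792 458 m/s is used.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def transform_velocity(u_prime, v):
    """Transform a velocity (ux', uy', uz') measured in S' into frame S.

    S' moves at speed v along the shared x-axis relative to S.
    """
    ux_p, uy_p, uz_p = u_prime
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    denom = 1.0 + v * ux_p / C**2
    return (
        (ux_p + v) / denom,
        uy_p / (gamma * denom),
        uz_p / (gamma * denom),
    )


# Everyday speeds: the result is indistinguishable from Galilean addition.
print(transform_velocity((20.0, 0.0, 0.0), 30.0)[0])

# Two speeds of 0.9c along x combine to less than c, not to 1.8c.
print(transform_velocity((0.9 * C, 0.0, 0.0), 0.9 * C)[0] / C)

# Light emitted along y' in S' still has speed c in S.
ux, uy, uz = transform_velocity((0.0, C, 0.0), 0.6 * C)
print(math.sqrt(ux**2 + uy**2 + uz**2) / C)
```

The last case shows why the transverse components must carry the factor 1/γ: without it, the light's speed in S would exceed c.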
Example 5.9


Velocity Transformation Equations for Light
Suppose a spaceship heading directly toward Earth at half the speed of light sends a signal to us on a laser-produced beam of light (Figure 5.20). Given that the light leaves the ship at speed c as observed from the ship, calculate the speed at which it approaches Earth.


Figure 5.20 How fast does a light signal approach Earth if sent from a spaceship traveling at 0.500c?


Strategy
Because the light and the spaceship are moving at relativistic speeds, we cannot use simple velocity addition. Instead, we determine the speed at which the light approaches Earth using relativistic velocity addition.
Solution
a. Identify the knowns: v = 0.500c; u′ = c.

b. Identify the unknown: u.

c. Express the answer as an equation:

$$u = \frac{v + u'}{1 + \dfrac{vu'}{c^2}}.$$

d. Do the calculation:

$$u = \frac{v + u'}{1 + \dfrac{vu'}{c^2}} = \frac{0.500c + c}{1 + \dfrac{(0.500c)(c)}{c^2}} = \frac{(0.500 + 1)c}{\left(\dfrac{c^2 + 0.500c^2}{c^2}\right)} = c.$$


Significance
Relativistic velocity addition gives the correct result. Light leaves the ship at speed c and approaches Earth at speed c. The speed of light is independent of the relative motion of source and observer, whether the observer is on the ship or earthbound.


Velocities cannot add to greater than the speed of light, provided that v is less than c and u′ does not exceed c. The following example illustrates that relativistic velocity addition is not as symmetric as classical velocity addition.






Example 5.10
Relativistic Package Delivery
Suppose the spaceship in the previous example approaches Earth at half the speed of light and shoots a canister at a speed of 0.750c (Figure 5.21). (a) At what velocity does an earthbound observer see the canister if it is shot directly toward Earth? (b) If it is shot directly away from Earth?


Figure 5.21 A canister is fired at 0.750c toward Earth or away from Earth.


Strategy
Because the canister and the spaceship are moving at relativistic speeds, we must determine the speed of the canister as seen by an earthbound observer using relativistic velocity addition instead of simple velocity addition.
Solution for (a)
a. Identify the knowns: v = 0.500c; u′ = 0.750c.

b. Identify the unknown: u.

c. Express the answer as an equation:

$$u = \frac{v + u'}{1 + \dfrac{vu'}{c^2}}.$$

d. Do the calculation:

$$u = \frac{v + u'}{1 + \dfrac{vu'}{c^2}} = \frac{0.500c + 0.750c}{1 + \dfrac{(0.500c)(0.750c)}{c^2}} = 0.909c.$$


Solution for (b)
a. Identify the knowns: v = 0.500c; u′ = −0.750c.

b. Identify the unknown: u.

c. Express the answer as an equation:

$$u = \frac{v + u'}{1 + \dfrac{vu'}{c^2}}.$$

d. Do the calculation:

$$u = \frac{v + u'}{1 + \dfrac{vu'}{c^2}} = \frac{0.500c + (-0.750c)}{1 + \dfrac{(0.500c)(-0.750c)}{c^2}} = -0.400c.$$
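The arithmetic in parts (a) and (b) can be verified with a one-dimensional helper that works in units of c. This is a sketch only; the function name is an illustrative choice, and positive values mean motion toward Earth, matching the sign convention of the solutions.

```python
def add_velocities(v, u_prime):
    """Relativistic addition of collinear velocities, both in units of c."""
    return (v + u_prime) / (1 + v * u_prime)


v_ship = 0.500

u_a = add_velocities(v_ship, 0.750)   # canister shot toward Earth
u_b = add_velocities(v_ship, -0.750)  # canister shot away from Earth

print(f"(a) u = {u_a:.3f}c")  # (a) u = 0.909c
print(f"(b) u = {u_b:.3f}c")  # (b) u = -0.400c

# Rate at which canister and ship separate, as measured from Earth.
print(f"(a) separation rate = {abs(u_a - v_ship):.3f}c")  # 0.409c
print(f"(b) separation rate = {abs(u_b - v_ship):.3f}c")  # 0.900c
```

Although the canister leaves the ship at 0.750c in both cases as measured on the ship, the separation rates measured from Earth differ, which is the asymmetry that classical velocity addition would not produce.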


Significance
The minus sign indicates a velocity away from Earth (in the opposite direction from v), which means the canister is heading toward Earth in part (a) and away in part (b), as expected. But relativistic velocities do not add as simply as they do classically. In part (a), the canister does approach Earth faster, but at less than the vector sum of the velocities, which would give 1.250c. In part (b), the canister moves away from Earth at a velocity of −0.400c, which is faster than the −0.250c expected classically. The differences in velocities are not even symmetric: In part (a), an observer on Earth sees the canister and the ship moving apart at a speed of 0.409c, and at a speed of 0.900c in part (b).


5.6 Check Your Understanding Distances along a direction perpendicular to the relative motion of the two frames are the same in both frames. Why then are velocities perpendicular to the x-direction different in the two frames?


5.7 | Doppler Effect for Light
Learning Objectives


By the end of this section, you will be able to:
• Explain the origin of the shift in frequency and wavelength of the observed light when observer and source move toward or away from each other
• Derive an expression for the relativistic Doppler shift
• Apply the Doppler shift equations to real-world examples


As discussed in the chapter on sound, if a source of sound and a listener are moving farther apart, the listener encounters fewer cycles of a wave in each second, and therefore lower frequency, than if their separation remains constant. For the same reason, the listener detects a higher frequency if the source and listener are getting closer. The resulting Doppler shift in detected frequency occurs for any form of wave. For sound waves, however, the equations for the Doppler shift differ markedly depending on whether it is the source, the observer, or the air that is moving. Light requires no medium, and the Doppler shift for light traveling in vacuum depends only on the relative speed of the observer and source.
The Relativistic Doppler Effect
Suppose an observer in S sees light from a source in S′ moving away at velocity v (Figure 5.22). The wavelength of the light could be measured within S′, for example, by using a mirror to set up standing waves and measuring the distance between nodes. These distances are proper lengths with S′ as their rest frame, and change by a factor √(1 − v²/c²) when measured in the observer’s frame S, where the ruler measuring the wavelength in S′ is seen as moving.






Figure 5.22 (a) When a light wave is emitted by a source fixed in the moving inertial frame S′, the observer in S sees the wavelength measured in S′ to be shorter by a factor √(1 − v²/c²). (b) Because the observer sees the source moving away within S, the wave pattern reaching the observer in S is also stretched by the factor (cΔt + vΔt)/(cΔt) = 1 + v/c.


If the source were stationary in S, the observer would see a length cΔt of the wave pattern in time Δt. But because of the motion of S′ relative to S, considered solely within S, the observer sees the wave pattern, and therefore the wavelength, stretched out by a factor of

$$\frac{c\Delta t_{\text{period}} + v\Delta t_{\text{period}}}{c\Delta t_{\text{period}}} = 1 + \frac{v}{c}$$

as illustrated in (b) of Figure 5.22. The overall increase from both effects gives

$$\lambda_{\text{obs}} = \lambda_{\text{src}}\left(1 + \frac{v}{c}\right)\sqrt{\frac{1}{1 - \dfrac{v^2}{c^2}}} = \lambda_{\text{src}}\left(1 + \frac{v}{c}\right)\sqrt{\frac{1}{\left(1 + \dfrac{v}{c}\right)\left(1 - \dfrac{v}{c}\right)}} = \lambda_{\text{src}}\sqrt{\frac{1 + \dfrac{v}{c}}{1 - \dfrac{v}{c}}}$$

where λsrc is the wavelength of the light seen by the source in S′ and λobs is the wavelength that the observer detects within S.
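The algebraic step that merges the two factors into a single square root can be confirmed numerically. In this sketch the function names are illustrative, and β stands for v/c, positive for a receding source.

```python
import math


def stretch_two_step(beta):
    """Product of the two factors: (1 + beta) * sqrt(1 / (1 - beta^2))."""
    return (1 + beta) * math.sqrt(1 / (1 - beta**2))


def stretch_compact(beta):
    """Single-root form: sqrt((1 + beta) / (1 - beta))."""
    return math.sqrt((1 + beta) / (1 - beta))


for beta in (0.01, 0.3, 0.6, 0.9):
    print(f"beta = {beta}: {stretch_two_step(beta):.6f} vs {stretch_compact(beta):.6f}")
```

Both columns agree for every β, so either form may be used for the ratio λobs/λsrc.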
Red Shifts and Blue Shifts
The observed wavelength λobs of electromagnetic radiation is longer (called a “red shift”) than that emitted by the source when the source moves away from the observer. Similarly, the wavelength is shorter (called a “blue shift”) when the source moves toward the observer. The amount of change is determined by

$$\lambda_{\text{obs}} = \lambda_s\sqrt{\frac{1 + \dfrac{v}{c}}{1 - \dfrac{v}{c}}}$$

where λs is the wavelength in the frame of reference of the source, and v is the relative velocity of the two frames S and S′. The velocity v is positive for motion away from an observer and negative for motion toward an observer. In terms of source frequency and observed frequency, this equation can be written as

$$f_{\text{obs}} = f_s\sqrt{\frac{1 - \dfrac{v}{c}}{1 + \dfrac{v}{c}}}.$$


Notice that the signs are different from those of the wavelength equation.
Example 5.11


Calculating a Doppler Shift
Suppose a galaxy is moving away from Earth at a speed 0.825c. It emits radio waves with a wavelength of 0.525 m. What wavelength would we detect on Earth?
Strategy
Because the galaxy is moving at a relativistic speed, we must determine the Doppler shift of the radio waves using the relativistic Doppler shift instead of the classical Doppler shift.
Solution
a. Identify the knowns: v = 0.825c; λs = 0.525 m.

b. Identify the unknown: λobs.

c. Express the answer as an equation:

$$\lambda_{\text{obs}} = \lambda_s\sqrt{\frac{1 + \dfrac{v}{c}}{1 - \dfrac{v}{c}}}.$$

d. Do the calculation:

$$\lambda_{\text{obs}} = \lambda_s\sqrt{\frac{1 + \dfrac{v}{c}}{1 - \dfrac{v}{c}}} = (0.525\ \text{m})\sqrt{\frac{1 + \dfrac{0.825c}{c}}{1 - \dfrac{0.825c}{c}}} = 1.70\ \text{m}.$$


Significance
Because the galaxy is moving away from Earth, we expect the wavelengths of radiation it emits to be redshifted. The wavelength we calculated is 1.70 m, which is redshifted from the original wavelength of 0.525 m. You will see in Particle Physics and Cosmology that detecting redshifted radiation led to present-day understanding of the origin and evolution of the universe.
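The wavelength and frequency forms of the Doppler equation can be wrapped in two small functions to reproduce this example. This is a sketch under the sign convention stated above (β = v/c positive for a receding source); the function names and the 1 Hz test frequency are illustrative assumptions.

```python
import math


def observed_wavelength(lambda_src, beta):
    """Observed wavelength for a source with beta = v/c (positive = receding)."""
    return lambda_src * math.sqrt((1 + beta) / (1 - beta))


def observed_frequency(f_src, beta):
    """Observed frequency for a source with beta = v/c (positive = receding)."""
    return f_src * math.sqrt((1 - beta) / (1 + beta))


lam = observed_wavelength(0.525, 0.825)
print(f"lambda_obs = {lam:.2f} m")  # lambda_obs = 1.70 m

# The same galaxy approaching at 0.825c would be blueshifted instead.
print(f"approaching: {observed_wavelength(0.525, -0.825):.3f} m")

# Wavelength and frequency shift by reciprocal factors, so their product
# (the wave speed) is the same for source and observer.
product_ratio = (observed_wavelength(1.0, 0.825) * observed_frequency(1.0, 0.825))
print(f"(lambda_obs * f_obs) / (lambda_s * f_s) = {product_ratio:.6f}")
```

The last line illustrates why the signs in the frequency equation are reversed relative to the wavelength equation.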


5.7 Check Your Understanding Suppose a space probe moves away from Earth at a speed 0.350c. It sends a radio-wave message back to Earth at a frequency of 1.50 GHz. At what frequency is the message received on Earth?
The relativistic Doppler effect has applications ranging from Doppler radar storm monitoring to providing information on the motion and distance of stars. We describe some of these applications in the exercises.






5.8 | Relativistic Momentum
Learning Objectives


By the end of this section, you will be able to:
• Define relativistic momentum in terms of mass and velocity
• Show how relativistic momentum relates to classical momentum
• Show how conservation of relativistic momentum limits objects with mass to speeds less than c


Momentum is a central concept in physics. The broadest form of Newton’s second law is stated in terms of momentum. Momentum is conserved whenever the net external force on a system is zero. This makes momentum conservation a fundamental tool for analyzing collisions (Figure 5.23). Much of what we know about subatomic structure comes from the analysis of collisions of accelerator-produced relativistic particles, and momentum conservation plays a crucial role in this analysis.


Figure 5.23 Momentum is an important concept for these football players from the University of California at Berkeley and the University of California at Davis. A player with the same velocity but greater mass collides with greater impact because his momentum is greater. For objects moving at relativistic speeds, the effect is even greater.


The first postulate of relativity states that the laws of physics are the same in all inertial frames. Does the law of conservation of momentum survive this requirement at high velocities? It can be shown that the momentum calculated as merely p→ = m dx→/dt, even if it is conserved in one frame of reference, may not be conserved in another after applying the Lorentz transformation to the velocities. The correct equation for momentum can be shown, instead, to be the classical expression in terms of the increment dτ of proper time of the particle, observed in the particle’s rest frame:

$$\vec{p} = m\frac{d\vec{x}}{d\tau} = m\frac{d\vec{x}}{dt}\frac{dt}{d\tau} = m\frac{d\vec{x}}{dt}\frac{1}{\sqrt{1 - u^2/c^2}} = \frac{m\vec{u}}{\sqrt{1 - u^2/c^2}} = \gamma m\vec{u}.$$


Relativistic Momentum
Relativistic momentum p→ is classical momentum multiplied by the relativistic factor γ:

$$\vec{p} = \gamma m\vec{u} \qquad (5.6)$$

where m is the rest mass of the object, u→ is its velocity relative to an observer, and γ is the relativistic factor:

$$\gamma = \frac{1}{\sqrt{1 - \dfrac{u^2}{c^2}}}. \qquad (5.7)$$


Note that we use u for velocity here to distinguish it from relative velocity v between observers. The factor γ that occurs here has the same form as the previous relativistic factor γ except that it is now in terms of the velocity of the particle u instead of the relative velocity v of two frames of reference.
With p expressed in this way, total momentum ptot is conserved whenever the net external force is zero, just as in classical physics. Again we see that the relativistic quantity becomes virtually the same as the classical quantity at low velocities, where u/c is small and γ is very nearly equal to 1. Relativistic momentum has the same intuitive role as classical momentum. It is greatest for large masses moving at high velocities, but because of the factor γ, relativistic momentum approaches infinity as u approaches c (Figure 5.24). This is another indication that an object with mass cannot reach the speed of light. If it did, its momentum would become infinite, an unreasonable value.
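Both limits described above, agreement with classical momentum at low speed and unbounded growth near c, can be seen by tabulating the ratio of relativistic to classical momentum. In this sketch the function names, the 1 kg test mass, and the list of speeds are illustrative choices.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(u):
    """Relativistic factor for a particle moving at speed u (Equation 5.7)."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)


def relativistic_momentum(m, u):
    """Magnitude of the relativistic momentum p = gamma * m * u (Equation 5.6)."""
    return gamma(u) * m * u


m = 1.0  # kg, arbitrary test mass
for fraction in (0.001, 0.1, 0.5, 0.9, 0.99, 0.999):
    u = fraction * C
    ratio = relativistic_momentum(m, u) / (m * u)
    print(f"u = {fraction}c  p_rel / p_classical = {ratio:.4f}")
```

The ratio is simply γ, which stays near 1 at everyday speeds and grows without limit as u approaches c.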


Figure 5.24 Relativistic momentum approaches infinity as the velocity of an object approaches the speed of light.


The relativistically correct definition of momentum as p = γmu is sometimes taken to imply that mass varies with velocity: mvar = γm, particularly in older textbooks. However, note that m is the mass of the object as measured by a person at rest relative to the object. Thus, m is defined to be the rest mass, which could be measured at rest, perhaps using gravity. When a mass is moving relative to an observer, the only way that its mass can be determined is through collisions or other means involving momentum. Because the mass of a moving object cannot be determined independently of momentum, the only meaningful mass is rest mass. Therefore, when we use the term “mass,” assume it to be identical to “rest mass.”
Relativistic momentum is defined in such a way that conservation of momentum holds in all inertial frames. Whenever the net external force on a system is zero, relativistic momentum is conserved, just as is the case for classical momentum. This has been verified in numerous experiments.


5.8 Check Your Understanding What is the momentum of an electron traveling at a speed 0.985c? The rest mass of the electron is 9.11 × 10⁻³¹ kg.


5.9 | Relativistic Energy
Learning Objectives


By the end of this section, you will be able to:
• Explain how the work-energy theorem leads to an expression for the relativistic kinetic energy of an object
• Show how the relativistic energy relates to the classical kinetic energy, and sets a limit on the speed of any object with mass
• Describe how the total energy of a particle is related to its mass and velocity
• Explain how relativity relates to energy-mass equivalence, and some of the practical implications of energy-mass equivalence


The tokamak in Figure 5.25 is a form of experimental fusion reactor, which can change mass to energy. Nuclear reactors are proof of the relationship between energy and matter.
Conservation of energy is one of the most important laws in physics. Not only does energy have many important forms, but each form can be converted to any other. We know that classically, the total amount of energy in a system remains constant. Relativistically, energy is still conserved, but energy-mass equivalence must now be taken into account, for example, in the reactions that occur within a nuclear reactor. Relativistic energy is intentionally defined so that it is conserved in all inertial frames, just as is the case for relativistic momentum. As a consequence, several fundamental quantities are related in ways not known in classical physics. All of these relationships have been verified by experimental results and have fundamental consequences. The altered definition of energy contains some of the most fundamental and spectacular new insights into nature in recent history.


Figure 5.25 The National Spherical Torus Experiment (NSTX) is a fusion reactor in which hydrogen isotopes undergo fusion to produce helium. In this process, a relatively small mass of fuel is converted into a large amount of energy. (credit: Princeton Plasma Physics Laboratory)
Kinetic Energy and the Ultimate Speed Limit
The first postulate of relativity states that the laws of physics are the same in all inertial frames. Einstein showed that the law of conservation of energy of a particle is valid relativistically, provided that energy is expressed in terms of velocity and mass in a way consistent with relativity.
Consider first the relativistic expression for the kinetic energy. We again use $u$ for velocity to distinguish it from relative velocity $v$ between observers. Classically, kinetic energy is related to mass and speed by the familiar expression $K = \frac{1}{2}mu^2$. The corresponding relativistic expression for kinetic energy can be obtained from the work-energy theorem. This theorem states that the net work on a system goes into kinetic energy. Specifically, if a force, expressed as $\vec{F} = \dfrac{d\vec{p}}{dt} = m\dfrac{d(\gamma\vec{u})}{dt}$, accelerates a particle from rest to its final velocity, the work done on the particle should be equal to its final kinetic energy. In mathematical form, for one-dimensional motion:
$$
K = \int F\,dx = \int m\frac{d}{dt}(\gamma u)\,dx
= m\int \frac{d(\gamma u)}{dt}\,\frac{dx}{dt}\,dt
= m\int u\,\frac{d}{dt}\left(\frac{u}{\sqrt{1-(u/c)^2}}\right)dt.
$$


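Integrate this by parts to obtain
$$
\begin{aligned}
K &= \left.\frac{mu^2}{\sqrt{1-(u/c)^2}}\right|_0^u - m\int \frac{u}{\sqrt{1-(u/c)^2}}\,\frac{du}{dt}\,dt \\
&= \frac{mu^2}{\sqrt{1-(u/c)^2}} - m\int \frac{u}{\sqrt{1-(u/c)^2}}\,du \\
&= \frac{mu^2}{\sqrt{1-(u/c)^2}} + mc^2\left.\left(\sqrt{1-(u/c)^2}\right)\right|_0^u \\
&= \frac{mu^2}{\sqrt{1-(u/c)^2}} + mc^2\sqrt{1-(u/c)^2} - mc^2 \\
&= mc^2\left[\frac{(u^2/c^2) + 1 - (u^2/c^2)}{\sqrt{1-(u/c)^2}}\right] - mc^2 \\
K &= \frac{mc^2}{\sqrt{1-(u/c)^2}} - mc^2.
\end{aligned}
$$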


Relativistic Kinetic Energy
Relativistic kinetic energy of any particle of mass $m$ is

$$K_{\text{rel}} = (\gamma - 1)mc^2. \qquad (5.8)$$
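As a rough numerical cross-check of the derivation, the work integral can be approximated directly and compared with the closed form $(\gamma - 1)mc^2$. The sketch below is only an illustration: the function names `work_integral` and `k_rel`, the step count, and the choice of an electron at $0.8c$ are assumptions made here, not part of the text. It uses the equivalent form $K = \int u\,dp$ with $p = \gamma m u$ and a simple trapezoidal rule.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(u):
    """Relativistic factor for speed u (m/s)."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)


def work_integral(m, u_final, steps=100_000):
    """Approximate K = integral of u dp, with p = gamma*m*u, by the trapezoidal rule."""
    total = 0.0
    u_prev, p_prev = 0.0, 0.0
    for i in range(1, steps + 1):
        u = u_final * i / steps
        p = gamma(u) * m * u
        # average speed over the step times the change in momentum
        total += 0.5 * (u + u_prev) * (p - p_prev)
        u_prev, p_prev = u, p
    return total


def k_rel(m, u):
    """Closed-form relativistic kinetic energy, Equation 5.8."""
    return (gamma(u) - 1.0) * m * C ** 2


m_e = 9.11e-31        # electron mass in kg (illustrative choice)
u = 0.8 * C           # illustrative final speed
numeric = work_integral(m_e, u)
closed = k_rel(m_e, u)
print(f"numerical work integral: {numeric:.6e} J")
print(f"(gamma - 1) m c^2:       {closed:.6e} J")
```

If the derivation is right, the two printed values should agree to many significant figures; increasing `steps` tightens the agreement.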
When an object is motionless, its speed is $u = 0$ and

$$\gamma = \frac{1}{\sqrt{1 - \dfrac{u^2}{c^2}}} = 1$$

so that $K_{\text{rel}} = 0$ at rest, as expected. But the expression for relativistic kinetic energy (such as total energy and rest energy) does not look much like the classical $\frac{1}{2}mu^2$. To show that the expression for $K_{\text{rel}}$ reduces to the classical expression for kinetic energy at low speeds, we use the binomial expansion to obtain an approximation for $(1 + \varepsilon)^n$ valid for small $\varepsilon$:
$$(1 + \varepsilon)^n = 1 + n\varepsilon + \frac{n(n-1)}{2!}\varepsilon^2 + \frac{n(n-1)(n-2)}{3!}\varepsilon^3 + \cdots \approx 1 + n\varepsilon$$


by neglecting the very small terms in $\varepsilon^2$ and higher powers of $\varepsilon$. Choosing $\varepsilon = -u^2/c^2$ and $n = -\frac{1}{2}$ leads to the conclusion that $\gamma$ at nonrelativistic speeds, where $u/c$ is small, satisfies

$$\gamma = \left(1 - u^2/c^2\right)^{-1/2} \approx 1 + \frac{1}{2}\left(\frac{u^2}{c^2}\right).$$


A binomial expansion is a way of expressing an algebraic quantity as a sum of an infinite series of terms. In some cases, as in the limit of small speed here, most terms are very small. Thus, the expression derived here for $\gamma$ is not exact, but it is a very accurate approximation. Therefore, at low speed:

$$\gamma - 1 \approx \frac{1}{2}\left(\frac{u^2}{c^2}\right).$$


Entering this into the expression for relativistic kinetic energy gives

$$K_{\text{rel}} \approx \left[\frac{1}{2}\left(\frac{u^2}{c^2}\right)\right]mc^2 = \frac{1}{2}mu^2 = K_{\text{class}}.$$


That is, relativistic kinetic energy becomes the same as classical kinetic energy when u<<c.
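The low-speed limit can also be seen numerically. The following is a minimal sketch, assuming the helper names `k_rel`, `k_class`, and `ratio` (introduced here for illustration) and a unit test mass. It prints the ratio $K_{\text{rel}}/K_{\text{class}}$ for several values of $u/c$. Keeping the next term of the binomial expansion, $\frac{3}{8}u^4/c^4$, suggests the ratio should behave like $1 + \frac{3}{4}(u/c)^2$ at small speeds.

```python
import math

C = 2.998e8  # speed of light in m/s


def k_rel(m, u):
    """Relativistic kinetic energy (gamma - 1) m c^2."""
    gamma = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    return (gamma - 1.0) * m * C ** 2


def k_class(m, u):
    """Classical kinetic energy (1/2) m u^2."""
    return 0.5 * m * u ** 2


def ratio(beta, m=1.0):
    """K_rel / K_class for speed u = beta * c; the mass cancels out."""
    u = beta * C
    return k_rel(m, u) / k_class(m, u)


for beta in (0.001, 0.01, 0.1, 0.5, 0.9):
    print(f"u/c = {beta:<6}  K_rel/K_class = {ratio(beta):.6f}")
```

The ratio stays very close to 1 for speeds of a few percent of $c$ or less and grows noticeably once $u/c$ reaches a few tenths.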
It is even more interesting to investigate what happens to kinetic energy when the speed of an object approaches the speed of light. We know that $\gamma$ becomes infinite as $u$ approaches $c$, so that $K_{\text{rel}}$ also becomes infinite as the velocity approaches the speed of light (Figure 5.26). The increase in $K_{\text{rel}}$ is far larger than in $K_{\text{class}}$ as $u$ approaches $c$. An infinite amount of work (and, hence, an infinite amount of energy input) is required to accelerate a mass to the speed of light.


The Speed of Light
No object with mass can attain the speed of light.


The speed of light is the ultimate speed limit for any particle having mass. All of this is consistent with the fact that velocities less than $c$ always add to less than $c$. Both the relativistic form for kinetic energy and the ultimate speed limit being $c$ have been confirmed in detail in numerous experiments. No matter how much energy is put into accelerating a mass, its velocity can only approach, not reach, the speed of light.
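One way to see the speed limit at work is to invert $K_{\text{rel}} = (\gamma - 1)mc^2$ for the speed: $\gamma = 1 + K_{\text{rel}}/mc^2$ and $u/c = \sqrt{1 - 1/\gamma^2}$. The sketch below assumes rounded constants and a hypothetical helper `speed_fraction`; the list of energies is an arbitrary illustration for an electron.

```python
import math

C = 3.00e8            # speed of light in m/s (rounded)
M_E = 9.11e-31        # electron mass in kg
J_PER_MEV = 1.60e-13  # joules per MeV (rounded)


def speed_fraction(k_mev, m=M_E):
    """Return u/c for a particle of mass m (kg) with kinetic energy k_mev (MeV)."""
    rest_energy = m * C ** 2                        # mc^2 in joules
    gamma = 1.0 + k_mev * J_PER_MEV / rest_energy   # from K = (gamma - 1) m c^2
    return math.sqrt(1.0 - 1.0 / gamma ** 2)


for k in (0.1, 1.0, 3.12, 100.0, 50_000.0):
    print(f"K = {k:>9.2f} MeV  ->  u/c = {speed_fraction(k):.12f}")
```

Each large increase in kinetic energy buys a smaller and smaller increase in speed, and $u/c$ stays below 1 for every finite energy in the list.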


Figure 5.26 This graph of $K_{\text{rel}}$ versus velocity shows how kinetic energy increases without bound as velocity approaches the speed of light. Also shown is $K_{\text{class}}$, the classical kinetic energy.


Example 5.12
Comparing Kinetic Energy
An electron has a velocity $v = 0.990c$. (a) Calculate the kinetic energy in MeV of the electron. (b) Compare this with the classical value for kinetic energy at this velocity. (The mass of an electron is $9.11 \times 10^{-31}\ \text{kg}$.)
Strategy
The expression for relativistic kinetic energy is always correct, but for (a), it must be used because the velocity is highly relativistic (close to $c$). First, we calculate the relativistic factor $\gamma$, and then use it to determine the relativistic kinetic energy. For (b), we calculate the classical kinetic energy (which would be close to the relativistic value if $v$ were less than a few percent of $c$) and see that it is not the same.
Solution for (a)
For part (a):


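a. Identify the knowns: $v = 0.990c$; $m = 9.11 \times 10^{-31}\ \text{kg}$.
b. Identify the unknown: $K_{\text{rel}}$.
c. Express the answer as an equation: $K_{\text{rel}} = (\gamma - 1)mc^2$ with $\gamma = \dfrac{1}{\sqrt{1 - u^2/c^2}}$.
d. Do the calculation. First calculate $\gamma$. Keep extra digits because this is an intermediate calculation:

$$\gamma = \frac{1}{\sqrt{1 - \dfrac{u^2}{c^2}}} = \frac{1}{\sqrt{1 - \dfrac{(0.990c)^2}{c^2}}} = 7.0888.$$

Now use this value to calculate the kinetic energy:

$$
\begin{aligned}
K_{\text{rel}} &= (\gamma - 1)mc^2 \\
&= (7.0888 - 1)(9.11 \times 10^{-31}\ \text{kg})(3.00 \times 10^{8}\ \text{m/s})^2 \\
&= 4.9922 \times 10^{-13}\ \text{J}.
\end{aligned}
$$

e. Convert units:

$$K_{\text{rel}} = (4.9922 \times 10^{-13}\ \text{J})\left(\frac{1\ \text{MeV}}{1.60 \times 10^{-13}\ \text{J}}\right) = 3.12\ \text{MeV}.$$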


Solution for (b)
For part (b):


a. List the knowns: $v = 0.990c$; $m = 9.11 \times 10^{-31}\ \text{kg}$.
b. List the unknown: $K_{\text{class}}$.
c. Express the answer as an equation: $K_{\text{class}} = \frac{1}{2}mu^2$.
d. Do the calculation:

$$
\begin{aligned}
K_{\text{class}} &= \frac{1}{2}mu^2 \\
&= \frac{1}{2}(9.11 \times 10^{-31}\ \text{kg})(0.990)^2(3.00 \times 10^{8}\ \text{m/s})^2 \\
&= 4.0179 \times 10^{-14}\ \text{J}.
\end{aligned}
$$

e. Convert units:

$$K_{\text{class}} = (4.0179 \times 10^{-14}\ \text{J})\left(\frac{1\ \text{MeV}}{1.60 \times 10^{-13}\ \text{J}}\right) = 0.251\ \text{MeV}.$$


Significance
As might be expected, because the velocity is 99.0% of the speed of light, the classical kinetic energy differs significantly from the correct relativistic value. Note also that the classical value is much smaller than the relativistic value. In fact, $K_{\text{rel}}/K_{\text{class}} = 12.4$ in this case. This illustrates how difficult it is to get a mass moving close to the speed of light. Much more energy is needed than predicted classically. Ever-increasing amounts of energy are needed to get the velocity of a mass a little closer to that of light. An energy of 3 MeV is a very small amount for an electron, and it can be achieved with present-day particle accelerators. SLAC, for example, can accelerate electrons to over $50 \times 10^{9}\ \text{eV} = 50{,}000\ \text{MeV}$.
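The arithmetic of Example 5.12 can be reproduced in a few lines. This is a sketch using the same rounded constants as the example; the variable names are chosen here for illustration.

```python
import math

C = 3.00e8            # speed of light in m/s (rounded, as in the example)
M_E = 9.11e-31        # electron mass in kg
J_PER_MEV = 1.60e-13  # joules per MeV (rounded, as in the example)

beta = 0.990                              # u/c for the electron
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)  # relativistic factor

k_rel_joule = (gamma - 1.0) * M_E * C ** 2   # part (a)
k_class_joule = 0.5 * M_E * (beta * C) ** 2  # part (b)

k_rel_mev = k_rel_joule / J_PER_MEV
k_class_mev = k_class_joule / J_PER_MEV

print(f"gamma         = {gamma:.4f}")
print(f"K_rel         = {k_rel_joule:.4e} J = {k_rel_mev:.3g} MeV")
print(f"K_class       = {k_class_joule:.4e} J = {k_class_mev:.3g} MeV")
print(f"K_rel/K_class = {k_rel_mev / k_class_mev:.3g}")
```

Changing `beta` to a value such as 0.999 shows how quickly the gap between the two expressions widens.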


Is there any point in getting $v$ a little closer to $c$ than 99.0% or 99.9%? The answer is yes. We learn a great deal by doing this. The energy that goes into a high-velocity mass can be converted into any other form, including into entirely new particles. In the Large Hadron Collider in Figure 5.27, charged particles are accelerated before entering the ring-like structure. There, two beams of particles are accelerated to their final speed of about 99.7% the speed of light in opposite directions, and made to collide, producing totally new species of particles. Most of what we know about the substructure of matter and the collection of exotic short-lived particles in nature has been learned this way. Patterns in the characteristics of these previously unknown particles hint at a basic substructure for all matter. These particles and some of their characteristics will be discussed in a later chapter on particle physics.


Figure 5.27 The European Organization for Nuclear Research (called CERN after its French name) operates the largest particle accelerator in the world, straddling the border between France and Switzerland.
Total Relativistic Energy
The expression for kinetic energy can be rearranged to:

$$E = \frac{mc^2}{\sqrt{1 - u^2/c^2}} = K + mc^2.$$


Einstein argued in a separate article, published later in 1905, that if the energy of a particle changes by $\Delta E$, its mass changes by $\Delta m = \Delta E/c^2$. Abundant experimental evidence since then confirms that $mc^2$ corresponds to the energy that the particle of mass $m$ has when at rest. For example, when a neutral pion of mass $m$ at rest decays into two photons, the photons have zero mass but are observed to have total energy corresponding to $mc^2$ for the pion. Similarly, when a particle of mass $m$ decays into two or more particles with smaller total mass, the observed kinetic energy imparted to the products of the decay corresponds to the decrease in mass. Thus, $E$ is the total relativistic energy of the particle, and $mc^2$ is its rest energy.


Total Energy
Total energy $E$ of a particle is

$$E = \gamma mc^2 \qquad (5.9)$$

where $m$ is mass, $c$ is the speed of light, $\gamma = \dfrac{1}{\sqrt{1 - \dfrac{u^2}{c^2}}}$, and $u$ is the velocity of the mass relative to an observer.


Rest Energy
Rest energy of an object is

$$E_0 = mc^2. \qquad (5.10)$$
This is the correct form of Einstein’s most famous equation, which for the first time showed that energy is related to the mass of an object at rest. For example, if energy is stored in the object, its rest mass increases. This also implies that mass can be destroyed to release energy. The implications of these first two equations regarding relativistic energy are so broad that they were not completely recognized for some years after Einstein published them in 1905, nor was the experimental proof that they are correct widely recognized at first. Einstein, it should be noted, did understand and describe the meanings and implications of his theory.
Example 5.13


Calculating Rest Energy
Calculate the rest energy of a 1.00-g mass.
Strategy
One gram is a small mass, less than one-half the mass of a penny. We can multiply this mass, in SI units, by the speed of light squared to find the equivalent rest energy.
Solution


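a. Identify the knowns: $m = 1.00 \times 10^{-3}\ \text{kg}$; $c = 3.00 \times 10^{8}\ \text{m/s}$.
b. Identify the unknown: $E_0$.
c. Express the answer as an equation: $E_0 = mc^2$.
d. Do the calculation:

$$E_0 = mc^2 = (1.00 \times 10^{-3}\ \text{kg})(3.00 \times 10^{8}\ \text{m/s})^2 = 9.00 \times 10^{13}\ \text{kg}\cdot\text{m}^2/\text{s}^2.$$

e. Convert units. Noting that $1\ \text{kg}\cdot\text{m}^2/\text{s}^2 = 1\ \text{J}$, we see the rest energy is:

$$E_0 = 9.00 \times 10^{13}\ \text{J}.$$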


Significance
This is an enormous amount of energy for a 1.00-g mass. Rest energy is large because the speed of light $c$ is a large number and $c^2$ is a very large number, so that $mc^2$ is huge for any macroscopic mass. The $9.00 \times 10^{13}\ \text{J}$ rest mass energy for 1.00 g is about twice the energy released by the Hiroshima atomic bomb and about 10,000 times the kinetic energy of a large aircraft carrier.


Today, the practical applications of the conversion of mass into another form of energy, such as in nuclear weapons and nuclear power plants, are well known. But examples also existed when Einstein first proposed the correct form of relativistic energy, and he did describe some of them. Nuclear radiation had been discovered in the previous decade, and it had been a mystery as to where its energy originated. The explanation was that, in some nuclear processes, a small amount of mass is destroyed and energy is released and carried by nuclear radiation. But the amount of mass destroyed is so small that it is difficult to detect that any is missing. Although Einstein proposed this as the source of energy in the radioactive salts then being studied, it was many years before there was broad recognition that mass could be and, in fact, commonly is, converted to energy (Figure 5.28).


Figure 5.28 (a) The sun and (b) the Susquehanna Steam Electric Station both convert mass into energy: the sun via nuclear fusion, and the electric station via nuclear fission. (credit a: modification of work by NASA; credit b: modification of work by “ChNPP”/Wikimedia Commons)


Because of the relationship of rest energy to mass, we now consider mass to be a form of energy rather than something separate. There had not been even a hint of this prior to Einstein’s work. Energy-mass equivalence is now known to be the source of the sun’s energy, the energy of nuclear decay, and even one of the sources of energy keeping Earth’s interior hot.
Stored Energy and Potential Energy
What happens to energy stored in an object at rest, such as the energy put into a battery by charging it, or the energy stored in a toy gun’s compressed spring? The energy input becomes part of the total energy of the object and thus increases its rest mass. All stored and potential energy becomes mass in a system. In seeming contradiction, the principle of conservation of mass (meaning total mass is constant) was one of the great laws verified by nineteenth-century science. Why was it not noticed to be incorrect? The following example helps answer this question.
Example 5.14


Calculating Rest Mass
A car battery is rated to be able to move 600 ampere-hours (A · h) of charge at 12.0 V. (a) Calculate the increase in rest mass of such a battery when it is taken from being fully depleted to being fully charged, assuming none of the chemical reactants enter or leave the battery. (b) What percent increase is this, given that the battery’s mass is 20.0 kg?
Strategy
In part (a), we first must find the energy stored as chemical energy $E_{\text{batt}}$ in the battery, which equals the electrical energy the battery can provide. Because $E_{\text{batt}} = qV$, we have to calculate the charge $q$ in 600 A · h, which is the product of the current $I$ and the time $t$. We then multiply the result by 12.0 V. We can then calculate the battery’s increase in mass using $E_{\text{batt}} = (\Delta m)c^2$. Part (b) is a simple ratio converted into a percentage.
Solution for (a)


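a. Identify the knowns: $I \cdot t = 600\ \text{A}\cdot\text{h}$; $V = 12.0\ \text{V}$; $c = 3.00 \times 10^{8}\ \text{m/s}$.
b. Identify the unknown: $\Delta m$.
c. Express the answer as an equation:

$$E_{\text{batt}} = (\Delta m)c^2, \qquad \Delta m = \frac{E_{\text{batt}}}{c^2} = \frac{qV}{c^2} = \frac{(It)V}{c^2}.$$

d. Do the calculation:

$$\Delta m = \frac{(600\ \text{A}\cdot\text{h})(12.0\ \text{V})}{(3.00 \times 10^{8}\ \text{m/s})^2}.$$

Write amperes A as coulombs per second (C/s), and convert hours into seconds:

$$\Delta m = \frac{(600\ \text{C/s}\cdot\text{h})\left(\dfrac{3600\ \text{s}}{1\ \text{h}}\right)(12.0\ \text{J/C})}{(3.00 \times 10^{8}\ \text{m/s})^2} = 2.88 \times 10^{-10}\ \text{kg},$$

where we have used the conversion $1\ \text{kg}\cdot\text{m}^2/\text{s}^2 = 1\ \text{J}$.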
Solution for (b)
For part (b):


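a. Identify the knowns: $\Delta m = 2.88 \times 10^{-10}\ \text{kg}$; $m = 20.0\ \text{kg}$.
b. Identify the unknown: % change.
c. Express the answer as an equation: $\%\ \text{increase} = \dfrac{\Delta m}{m} \times 100\%$.
d. Do the calculation:

$$\%\ \text{increase} = \frac{\Delta m}{m} \times 100\% = \frac{2.88 \times 10^{-10}\ \text{kg}}{20.0\ \text{kg}} \times 100\% = 1.44 \times 10^{-9}\,\%.$$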


Significance
Both the actual increase in mass and the percent increase are very small, because energy is divided by $c^2$, a very large number. We would have to be able to measure the mass of the battery to a precision of a billionth of a percent, or 1 part in $10^{11}$, to notice this increase. It is no wonder that the mass variation is not readily observed. In fact, this change in mass is so small that we may question how anyone could verify that it is real. The answer is found in nuclear processes in which the percentage of mass destroyed is large enough to be measured accurately. The mass of the fuel of a nuclear reactor, for example, is measurably smaller when its energy has been used. In that case, stored energy has been released (converted mostly into thermal energy to power electric generators) and the rest mass has decreased. A decrease in mass also occurs from using the energy stored in a battery, except that the stored energy is much greater in nuclear processes, making the change in mass measurable in practice as well as in theory.
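The unit conversions in Example 5.14 are easy to get wrong by hand, so a short script can serve as a check. This is a minimal sketch with the example's rounded values; the variable names are illustrative assumptions.

```python
C = 3.00e8               # speed of light in m/s (rounded, as in the example)

charge_amp_hours = 600.0  # battery rating in A*h
voltage = 12.0            # volts, i.e. joules per coulomb
battery_mass = 20.0       # kg

charge_coulombs = charge_amp_hours * 3600.0  # 1 A*h = 3600 C
energy_joules = charge_coulombs * voltage    # E_batt = qV
delta_m = energy_joules / C ** 2             # from E_batt = (delta m) c^2
percent_increase = delta_m / battery_mass * 100.0

print(f"stored energy    = {energy_joules:.3e} J")
print(f"mass increase    = {delta_m:.3e} kg")
print(f"percent increase = {percent_increase:.3e} %")
```

The stored energy is tens of megajoules, yet dividing by $c^2$ leaves a mass change far below what an ordinary balance could detect.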


Relativistic Energy and Momentum
We know classically that kinetic energy and momentum are related to each other, because:

$$K_{\text{class}} = \frac{p^2}{2m} = \frac{(mu)^2}{2m} = \frac{1}{2}mu^2.$$


Relativistically, we can obtain a relationship between energy and momentum by algebraically manipulating their defining equations. This yields:

$$E^2 = (pc)^2 + (mc^2)^2, \qquad (5.11)$$

where $E$ is the relativistic total energy, $E = mc^2/\sqrt{1 - u^2/c^2}$, and $p$ is the relativistic momentum. This relationship between relativistic energy and relativistic momentum is more complicated than the classical version, but we can gain some interesting new insights by examining it. First, total energy is related to momentum and rest mass. At rest, momentum is zero, and the equation gives the total energy to be the rest energy $mc^2$ (so this equation is consistent with the discussion of rest energy above). However, as the mass is accelerated, its momentum $p$ increases, thus increasing the total energy. At sufficiently high velocities, the rest energy term $(mc^2)^2$ becomes negligible compared with the momentum term $(pc)^2$; thus, $E = pc$ at extremely relativistic velocities.
If we consider momentum $p$ to be distinct from mass, we can determine the implications of the equation $E^2 = (pc)^2 + (mc^2)^2$ for a particle that has no mass. If we take $m$ to be zero in this equation, then $E = pc$, or $p = E/c$. Massless particles have this momentum. There are several massless particles found in nature, including photons (which are packets of electromagnetic radiation). Another implication is that a massless particle must travel at speed $c$ and only at speed $c$. It is beyond the scope of this text to examine the relationship in the equation $E^2 = (pc)^2 + (mc^2)^2$ in detail, but you can see that the relationship has important implications in special relativity.
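The algebraic manipulation mentioned above is short enough to write out. The following is one possible route, a sketch that uses only the definitions $E = \gamma mc^2$, $p = \gamma mu$, and $\gamma = 1/\sqrt{1 - u^2/c^2}$ already given in this chapter:

```latex
\begin{aligned}
E^2 - (pc)^2 &= \gamma^2 m^2 c^4 - \gamma^2 m^2 u^2 c^2 \\
             &= \gamma^2 m^2 c^4 \left(1 - \frac{u^2}{c^2}\right) \\
             &= m^2 c^4 \qquad \text{since } \gamma^2\left(1 - \frac{u^2}{c^2}\right) = 1, \\
\text{so}\quad E^2 &= (pc)^2 + (mc^2)^2 .
\end{aligned}
```

Setting $u = 0$ gives $p = 0$ and $E = mc^2$, in agreement with the rest energy.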


5.9 Check Your Understanding What is the kinetic energy of an electron if its speed is $0.992c$?






CHAPTER 5 REVIEW
KEY TERMS

classical (Galilean) velocity addition: method of adding velocities when $v \ll c$; velocities add like regular numbers in one-dimensional motion: $u = v + u'$, where $v$ is the velocity between two observers, $u$ is the velocity of an object relative to one observer, and $u'$ is the velocity relative to the other observer
event: occurrence in space and time specified by its position and time coordinates $(x, y, z, t)$ measured relative to a frame of reference
first postulate of special relativity: laws of physics are the same in all inertial frames of reference
Galilean relativity: if an observer measures a velocity in one frame of reference, and that frame of reference is moving with a velocity past a second reference frame, an observer in the second frame measures the original velocity as the vector sum of these velocities
Galilean transformation: relation between position and time coordinates of the same events as seen in different reference frames, according to classical mechanics
inertial frame of reference: reference frame in which a body at rest remains at rest and a body in motion moves at a constant speed in a straight line unless acted on by an outside force
length contraction: decrease in observed length of an object from its proper length $L_0$ to length $L$ when its length is observed in a reference frame where it is traveling at speed $v$
Lorentz transformation: relation between position and time coordinates of the same events as seen in different reference frames, according to the special theory of relativity
Michelson-Morley experiment: investigation performed in 1887 that showed that the speed of light in a vacuum is the same in all frames of reference from which it is viewed
proper length: $L_0$; the distance between two points measured by an observer who is at rest relative to both of the points; for example, earthbound observers measure proper length when measuring the distance between two points that are stationary relative to Earth
proper time: $\Delta\tau$ is the time interval measured by an observer who sees the beginning and end of the process that the time interval measures occur at the same location
relativistic kinetic energy: kinetic energy of an object moving at relativistic speeds
relativistic momentum: $\vec{p}$, the momentum of an object moving at relativistic velocity; $\vec{p} = \gamma m\vec{u}$
relativistic velocity addition: method of adding velocities of an object moving at relativistic speeds
rest energy: energy stored in an object at rest: $E_0 = mc^2$
rest frame: frame of reference in which the observer is at rest
rest mass: mass of an object as measured by an observer at rest relative to the object
second postulate of special relativity: light travels in a vacuum with the same speed $c$ in any direction in all inertial frames
special theory of relativity: theory that Albert Einstein proposed in 1905 that assumes all the laws of physics have the same form in every inertial frame of reference, and that the speed of light is the same within all inertial frames
speed of light: ultimate speed limit for any particle having mass
time dilation: lengthening of the time interval between two events when seen in a moving inertial frame rather than the rest frame of the events (in which the events occur at the same location)
total energy: sum of all energies for a particle, including rest energy and kinetic energy, given for a particle of mass $m$ and speed $u$ by $E = \gamma mc^2$, where $\gamma = \dfrac{1}{\sqrt{1 - \dfrac{u^2}{c^2}}}$
world line: path through space-time


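KEY EQUATIONS

Time dilation: $\Delta t = \dfrac{\Delta\tau}{\sqrt{1 - \dfrac{v^2}{c^2}}} = \gamma\,\Delta\tau$

Lorentz factor: $\gamma = \dfrac{1}{\sqrt{1 - \dfrac{v^2}{c^2}}}$

Length contraction: $L = L_0\sqrt{1 - \dfrac{v^2}{c^2}} = \dfrac{L_0}{\gamma}$

Galilean transformation: $x = x' + vt,\quad y = y',\quad z = z',\quad t = t'$

Lorentz transformation: $t = \dfrac{t' + vx'/c^2}{\sqrt{1 - v^2/c^2}},\quad x = \dfrac{x' + vt'}{\sqrt{1 - v^2/c^2}},\quad y = y',\quad z = z'$

Inverse Lorentz transformation: $t' = \dfrac{t - vx/c^2}{\sqrt{1 - v^2/c^2}},\quad x' = \dfrac{x - vt}{\sqrt{1 - v^2/c^2}},\quad y' = y,\quad z' = z$

Space-time invariants: $(\Delta s)^2 = (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 - c^2(\Delta t)^2$ and $(\Delta\tau)^2 = -(\Delta s)^2/c^2 = (\Delta t)^2 - \dfrac{\left[(\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2\right]}{c^2}$

Relativistic velocity addition: $u_x = \left(\dfrac{u_x' + v}{1 + vu_x'/c^2}\right),\quad u_y = \left(\dfrac{u_y'/\gamma}{1 + vu_x'/c^2}\right),\quad u_z = \left(\dfrac{u_z'/\gamma}{1 + vu_x'/c^2}\right)$

Relativistic Doppler effect for wavelength: $\lambda_{\text{obs}} = \lambda_s\sqrt{\dfrac{1 + \frac{v}{c}}{1 - \frac{v}{c}}}$

Relativistic Doppler effect for frequency: $f_{\text{obs}} = f_s\sqrt{\dfrac{1 - \frac{v}{c}}{1 + \frac{v}{c}}}$

Relativistic momentum: $\vec{p} = \gamma m\vec{u} = \dfrac{m\vec{u}}{\sqrt{1 - \dfrac{u^2}{c^2}}}$

Relativistic total energy: $E = \gamma mc^2$, where $\gamma = \dfrac{1}{\sqrt{1 - \dfrac{u^2}{c^2}}}$

Relativistic kinetic energy: $K_{\text{rel}} = (\gamma - 1)mc^2$, where $\gamma = \dfrac{1}{\sqrt{1 - \dfrac{u^2}{c^2}}}$
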


SUMMARY
5.1 Invariance of Physical Laws


• Relativity is the study of how observers in different reference frames measure the same event.
• Modern relativity is divided into two parts. Special relativity deals with observers in uniform (unaccelerated) motion, whereas general relativity includes accelerated relative motion and gravity. Modern relativity is consistent with all empirical evidence thus far and, in the limit of low velocity and weak gravitation, gives close agreement with the predictions of classical (Galilean) relativity.
• An inertial frame of reference is a reference frame in which a body at rest remains at rest and a body in motion moves at a constant speed in a straight line unless acted upon by an outside force.
• Modern relativity is based on Einstein’s two postulates. The first postulate of special relativity is that the laws of physics are the same in all inertial frames of reference. The second postulate of special relativity is that the speed of light $c$ is the same in all inertial frames of reference, independent of the relative motion of the observer and the light source.
• The Michelson-Morley experiment demonstrated that the speed of light in a vacuum is independent of the motion of Earth about the sun.


5.2 Relativity of Simultaneity
• Two events are defined to be simultaneous if an observer measures them as occurring at the same time (such as by receiving light from the events).
• Two events at locations a distance apart that are simultaneous for an observer at rest in one frame of reference are not necessarily simultaneous for an observer at rest in a different frame of reference.


5.3 Time Dilation
• Two events are defined to be simultaneous if an observer measures them as occurring at the same time. They are not necessarily simultaneous to all observers; simultaneity is not absolute.
• Time dilation is the lengthening of the time interval between two events when seen in a moving inertial frame rather than the rest frame of the events (in which the events occur at the same location).
• Observers moving at a relative velocity $v$ do not measure the same elapsed time between two events. Proper time $\Delta\tau$ is the time measured in the reference frame where the start and end of the time interval occur at the same location. The time interval $\Delta t$ measured by an observer who sees the frame of events moving at speed $v$ is related to the proper time interval $\Delta\tau$ of the events by the equation:

$$\Delta t = \frac{\Delta\tau}{\sqrt{1 - \dfrac{v^2}{c^2}}} = \gamma\,\Delta\tau,$$

where

$$\gamma = \frac{1}{\sqrt{1 - \dfrac{v^2}{c^2}}}.$$

• The premise of the twin paradox is faulty because the traveling twin is accelerating. The journey is not symmetrical for the two twins.
• Time dilation is usually negligible at low relative velocities, but it does occur, and it has been verified by experiment.
• The proper time is the shortest measure of any time interval. Any observer who is moving relative to the system being observed measures a time interval longer than the proper time.


5.4 Length Contraction
• All observers agree upon relative speed.
• Distance depends on an observer’s motion. Proper length $L_0$ is the distance between two points measured by an observer who is at rest relative to both of the points.
• Length contraction is the decrease in observed length of an object from its proper length $L_0$ to length $L$ when its length is observed in a reference frame where it is traveling at speed $v$.
• The proper length is the longest measurement of any length interval. Any observer who is moving relative to the system being observed measures a length shorter than the proper length.
5.5 The Lorentz Transformation


• The Galilean transformation equations describe how, in classical nonrelativistic mechanics, the position, velocity, and accelerations measured in one frame appear in another. Lengths remain unchanged and a single universal time scale is assumed to apply to all inertial frames.
• Newton’s laws of mechanics obey the principle of having the same form in all inertial frames under a Galilean transformation, given by

$$x = x' + vt,\quad y = y',\quad z = z',\quad t = t'.$$

The concept that times and distances are the same in all inertial frames in the Galilean transformation, however, is inconsistent with the postulates of special relativity.
• The relativistically correct Lorentz transformation equations are

Lorentz transformation: $t = \dfrac{t' + vx'/c^2}{\sqrt{1 - v^2/c^2}},\quad x = \dfrac{x' + vt'}{\sqrt{1 - v^2/c^2}},\quad y = y',\quad z = z'$

Inverse Lorentz transformation: $t' = \dfrac{t - vx/c^2}{\sqrt{1 - v^2/c^2}},\quad x' = \dfrac{x - vt}{\sqrt{1 - v^2/c^2}},\quad y' = y,\quad z' = z$

We can obtain these equations by requiring an expanding spherical light signal to have the same shape and speed of growth, $c$, in both reference frames.
• Relativistic phenomena can be explained in terms of the geometrical properties of four-dimensional space-time, in which Lorentz transformations correspond to rotations of axes.
• The Lorentz transformation corresponds to a space-time axis rotation, similar in some ways to a rotation of space axes, but one in which the invariant separation is given by $\Delta s$ rather than the distance $\Delta r$, and in which a transformation involving the time axis does not preserve perpendicularity of axes or the scales along the axes.
• The analysis of relativistic phenomena in terms of space-time diagrams supports the conclusion that these phenomena result from properties of space and time itself, rather than from the laws of electromagnetism.
5.6 Relativistic Velocity Transformation


• With classical velocity addition, velocities add like regular numbers in one-dimensional motion: $u = v + u'$, where $v$ is the velocity between two observers, $u$ is the velocity of an object relative to one observer, and $u'$ is the velocity relative to the other observer.
• Velocities cannot add to be greater than the speed of light.
• Relativistic velocity addition describes the velocities of an object moving at a relativistic velocity.


5.7 Doppler Effect for Light
• An observer of electromagnetic radiation sees relativistic Doppler effects if the source of the radiation is moving relative to the observer. The wavelength of the radiation is longer (called a red shift) than that emitted by the source when the source moves away from the observer and shorter (called a blue shift) when the source moves toward the observer. The shifted wavelength is described by the equation:

$$\lambda_{\text{obs}} = \lambda_s\sqrt{\frac{1 + \frac{v}{c}}{1 - \frac{v}{c}}},$$

where $\lambda_{\text{obs}}$ is the observed wavelength, $\lambda_s$ is the source wavelength, and $v$ is the relative velocity of the source to the observer.


5.8 Relativistic Momentum
• The law of conservation of momentum is valid for relativistic momentum whenever the net external force is zero. The relativistic momentum is $p = \gamma mu$, where $m$ is the rest mass of the object, $u$ is its velocity relative to an observer, and the relativistic factor is $\gamma = \dfrac{1}{\sqrt{1 - \dfrac{u^2}{c^2}}}$.
• At low velocities, relativistic momentum is equivalent to classical momentum.
• Relativistic momentum approaches infinity as $u$ approaches $c$. This implies that an object with mass cannot reach the speed of light.


5.9 Relativistic Energy
• The relativistic work-energy theorem is $W_{\text{net}} = E - E_0 = \gamma mc^2 - mc^2 = (\gamma - 1)mc^2$.
• Relativistically, $W_{\text{net}} = K_{\text{rel}}$, where $K_{\text{rel}}$ is the relativistic kinetic energy.
• An object of mass $m$ at velocity $u$ has kinetic energy $K_{\text{rel}} = (\gamma - 1)mc^2$, where $\gamma = \dfrac{1}{\sqrt{1 - \dfrac{u^2}{c^2}}}$.
• At low velocities, relativistic kinetic energy reduces to classical kinetic energy.
• No object with mass can attain the speed of light, because an infinite amount of work and an infinite amount of energy input is required to accelerate a mass to the speed of light.
• Relativistic energy is conserved as long as we define it to include the possibility of mass changing to energy.
• The total energy of a particle with mass $m$ traveling at speed $u$ is defined as $E = \gamma mc^2$, where $\gamma = \dfrac{1}{\sqrt{1 - \dfrac{u^2}{c^2}}}$ and $u$ denotes the velocity of the particle.
• The rest energy of an object of mass $m$ is $E_0 = mc^2$, meaning that mass is a form of energy. If energy is stored in an object, its mass increases. Mass can be destroyed to release energy.
• We do not ordinarily notice the increase or decrease in mass of an object because the change in mass is so small for a large increase in energy. The equation $E^2 = (pc)^2 + (mc^2)^2$ relates the relativistic total energy $E$ and the relativistic momentum $p$. At extremely high velocities, the rest energy $mc^2$ becomes negligible, and $E = pc$.






CONCEPTUAL QUESTIONS
5.1 Invariance of Physical Laws
1. Which of Einstein’s postulates of special relativity includes a concept that does not fit with the ideas of classical physics? Explain.
2. Is Earth an inertial frame of reference? Is the sun? Justify your response.
3. When you are flying in a commercial jet, it may appear to you that the airplane is stationary and Earth is moving beneath you. Is this point of view valid? Discuss briefly.


5.3 Time Dilation
4. (a) Does motion affect the rate of a clock as measured by an observer moving with it? (b) Does motion affect how an observer moving relative to a clock measures its rate?
5. To whom does the elapsed time for a process seem to be longer, an observer moving relative to the process or an observer moving with the process? Which observer measures the interval of proper time?
6. (a) How could you travel far into the future of Earth without aging significantly? (b) Could this method also allow you to travel into the past?


5.4 Length Contraction
7. To whom does an object seem greater in length, an observer moving with the object or an observer moving relative to the object? Which observer measures the object’s proper length?
8. Relativistic effects such as time dilation and length contraction are present for cars and airplanes. Why do these effects seem strange to us?
9. Suppose an astronaut is moving relative to Earth at a significant fraction of the speed of light. (a) Does he observe the rate of his clocks to have slowed? (b) What change in the rate of earthbound clocks does he see? (c) Does his ship seem to him to shorten? (d) What about the distance between two stars that lie in the direction of his motion? (e) Do he and an earthbound observer agree on his velocity relative to Earth?


5.7 Doppler Effect for Light
10. Explain the meaning of the terms “red shift” and “blue shift” as they relate to the relativistic Doppler effect.
11. What happens to the relativistic Doppler effect when relative velocity is zero? Is this the expected result?
12. Is the relativistic Doppler effect consistent with the classical Doppler effect in the respect that $\lambda_{\text{obs}}$ is larger for motion away?
13. All galaxies farther away than about $50 \times 10^{6}$ ly exhibit a red shift in their emitted light that is proportional to distance, with those farther and farther away having progressively greater red shifts. What does this imply, assuming that the only source of red shift is relative motion?


5.8 Relativistic Momentum
14. How does modern relativity modify the law of conservation of momentum?
15. Is it possible for an external force to be acting on a system and relativistic momentum to be conserved? Explain.


5.9 Relativistic Energy
16. How are the classical laws of conservation of energy and conservation of mass modified by modern relativity?
17. What happens to the mass of water in a pot when it cools, assuming no molecules escape or are added? Is this observable in practice? Explain.
18. Consider a thought experiment. You place an expanded balloon of air on weighing scales outside in the early morning. The balloon stays on the scales and you are able to measure changes in its mass. Does the mass of the balloon change as the day progresses? Discuss the difficulties in carrying out this experiment.
19. The mass of the fuel in a nuclear reactor decreases by an observable amount as it puts out energy. Is the same true for the coal and oxygen combined in a conventional power plant? If so, is this observable in practice for the coal and oxygen? Explain.
20. We know that the velocity of an object with mass has an upper limit of c. Is there an upper limit on its momentum? Its energy? Explain.
21. Given the fact that light travels at c, can it have mass? Explain.


22. If you use an Earth-based telescope to project a laser beam onto the moon, you can move the spot across the moon’s surface at a velocity greater than the speed of light. Does this violate modern relativity? (Note that light is being sent from the Earth to the moon, not across the surface of the moon.)


PROBLEMS
5.3 Time Dilation
23. (a) What is γ if v = 0.250c? (b) If v = 0.500c?
24. (a) What is γ if v = 0.100c? (b) If v = 0.900c?
25. Particles called π-mesons are produced by accelerator beams. If these particles travel at $2.70 \times 10^{8}$ m/s and live $2.60 \times 10^{-8}$ s when at rest relative to an observer, how long do they live as viewed in the laboratory?
26. Suppose a particle called a kaon is created by cosmic radiation striking the atmosphere. It moves by you at 0.980c, and it lives $1.24 \times 10^{-8}$ s when at rest relative to an observer. How long does it live as you observe it?
27. A neutral π-meson is a particle that can be created by accelerator beams. If one such particle lives $1.40 \times 10^{-16}$ s as measured in the laboratory, and $0.840 \times 10^{-16}$ s when at rest relative to an observer, what is its velocity relative to the laboratory?
28. A neutron lives 900 s when at rest relative to an observer. How fast is the neutron moving relative to an observer who measures its life span to be 2065 s?
29. If relativistic effects are to be less than 1%, then
γ must be less than 1.01. At what relative velocity is
γ = 1.01?


30. If relativistic effects are to be less than 3%, then
γ must be less than 1.03. At what relative velocity is
γ = 1.03?


5.4 Length Contraction
31. A spaceship, 200 m long as seen on board, moves by the Earth at 0.970c. What is its length as measured by an earthbound observer?
32. How fast would a 6.0-m-long sports car have to be going past you in order for it to appear only 5.5 m long?
33. (a) How far does the muon in Example 5.1 travel according to the earthbound observer? (b) How far does it travel as viewed by an observer moving with it? Base your calculation on its velocity relative to the Earth and the time it lives (proper time). (c) Verify that these two distances are related through length contraction γ = 3.20.
34. (a) How long would the muon in Example 5.1 have lived as observed on Earth if its velocity was 0.0500c? (b) How far would it have traveled as observed on Earth? (c) What distance is this in the muon’s frame?
35. Unreasonable Results A spaceship is heading directly toward Earth at a velocity of 0.800c. The astronaut on board claims that he can send a canister toward the Earth at 1.20c relative to Earth. (a) Calculate the velocity the canister must have relative to the spaceship. (b) What is unreasonable about this result? (c) Which assumptions are unreasonable or inconsistent?


5.5 The Lorentz Transformation
36. Describe the following physical occurrences as events, that is, in the form (x, y, z, t): (a) A postman rings a doorbell of a house precisely at noon. (b) At the same time as the doorbell is rung, a slice of bread pops out of a toaster that is located 10 m from the door in the east direction from the door. (c) Ten seconds later, an airplane arrives at the airport, which is 10 km from the door in the east direction and 2 km to the south.
37. Describe what happens to the angle $\alpha = \tan^{-1}(v/c)$, and therefore to the transformed axes in Figure 5.17, as the relative velocity v of the S and S′ frames of reference approaches c.
38. Describe the shape of the world line on a space-time diagram of (a) an object that remains at rest at a specific position along the x-axis; (b) an object that moves at constant velocity u in the x-direction; (c) an object that begins at rest and accelerates at a constant rate in the positive x-direction.
39. A man standing still at a train station watches two boys throwing a baseball in a moving train. Suppose the train is moving east with a constant speed of 20 m/s and one of the boys throws the ball with a speed of 5 m/s with respect to himself toward the other boy, who is 5 m west from him. What is the velocity of the ball as observed by the man on the station?


40. When observed from the sun at a particular instant, Earth and Mars appear to move in opposite directions with speeds 108,000 km/h and 86,871 km/h, respectively. What is the speed of Mars at this instant when observed from Earth?
41. A man is running on a straight road perpendicular to a train track and away from the track at a speed of 12 m/s. The train is moving with a speed of 30 m/s with respect to the track. What is the speed of the man with respect to a passenger sitting at rest in the train?
42. A man is running on a straight road that makes 30° with the train track. The man is running in the direction on the road that is away from the track at a speed of 12 m/s. The train is moving with a speed of 30 m/s with respect to the track. What is the speed of the man with respect to a passenger sitting at rest in the train?
43. In a frame at rest with respect to the billiard table, a billiard ball of mass m moving with speed v strikes another billiard ball of mass m at rest. The first ball comes to rest after the collision while the second ball takes off with speed v in the original direction of the motion of the first ball. This shows that momentum is conserved in this frame. (a) Now, describe the same collision from the perspective of a frame that is moving with speed v in the direction of the motion of the first ball. (b) Is the momentum conserved in this frame?
44. In a frame at rest with respect to the billiard table, two billiard balls of the same mass m are moving toward each other with the same speed v. After the collision, the two balls come to rest. (a) Show that momentum is conserved in this frame. (b) Now, describe the same collision from the perspective of a frame that is moving with speed v in the direction of the motion of the first ball. (c) Is the momentum conserved in this frame?
45. In a frame S, two events are observed: event 1: a pion is created at rest at the origin and event 2: the pion disintegrates after time τ. Another observer in a frame S′ is moving in the positive direction along the positive x-axis with a constant speed v and observes the same two events in his frame. The origins of the two frames coincide at t = t′ = 0. Find the positions and timings of these two events in the frame S′ (a) according to the Galilean transformation, and (b) according to the Lorentz transformation.


5.6 Relativistic Velocity Transformation
46. If two spaceships are heading directly toward each other at 0.800c, at what speed must a canister be shot from the first ship to approach the other at 0.999c as seen by the second ship?
47. Two planets are on a collision course, heading directly toward each other at 0.250c. A spaceship sent from one planet approaches the second at 0.750c as seen by the second planet. What is the velocity of the ship relative to the first planet?
48. When a missile is shot from one spaceship toward another, it leaves the first at 0.950c and approaches the other at 0.750c. What is the relative velocity of the two ships?
49. What is the relative velocity of two spaceships if one fires a missile at the other at 0.750c and the other observes it to approach at 0.950c?
50. Prove that for any relative velocity v between two observers, a beam of light sent from one to the other will approach at speed c (provided that v is less than c, of course).
51. Show that for any relative velocity v between two observers, a beam of light projected by one directly away from the other will move away at the speed of light (provided that v is less than c, of course).


5.7 Doppler Effect for Light
52. A highway patrol officer uses a device that measures the speed of vehicles by bouncing radar off them and measuring the Doppler shift. The outgoing radar has a frequency of 100 GHz and the returning echo has a frequency 15.0 kHz higher. What is the velocity of the vehicle? Note that there are two Doppler shifts in echoes. Be certain not to round off until the end of the problem, because the effect is small.


5.8 Relativistic Momentum
53. Find the momentum of a helium nucleus having a mass of $6.68 \times 10^{-27}$ kg that is moving at 0.200c.
54. What is the momentum of an electron traveling at 0.980c?
55. (a) Find the momentum of a $1.00 \times 10^{9}$-kg asteroid heading towards Earth at 30.0 km/s. (b) Find the ratio of this momentum to the classical momentum. (Hint: Use the approximation that $\gamma = 1 + (1/2)v^2/c^2$ at low velocities.)
56. (a) What is the momentum of a 2000-kg satellite orbiting at 4.00 km/s? (b) Find the ratio of this momentum to the classical momentum. (Hint: Use the approximation that $\gamma = 1 + (1/2)v^2/c^2$ at low velocities.)


57. What is the velocity of an electron that has a momentum of $3.04 \times 10^{-21}$ kg·m/s? Note that you must calculate the velocity to at least four digits to see the difference from c.
58. Find the velocity of a proton that has a momentum of $4.48 \times 10^{-19}$ kg·m/s.


5.9 Relativistic Energy
59. What is the rest energy of an electron, given its mass is $9.11 \times 10^{-31}$ kg? Give your answer in joules and MeV.
60. Find the rest energy in joules and MeV of a proton, given its mass is $1.67 \times 10^{-27}$ kg.
61. If the rest energies of a proton and a neutron (the two constituents of nuclei) are 938.3 and 939.6 MeV, respectively, what is the difference in their mass in kilograms?
62. The Big Bang that began the universe is estimated to have released $10^{68}$ J of energy. How many stars could half this energy create, assuming the average star’s mass is $4.00 \times 10^{30}$ kg?
63. A supernova explosion of a $2.00 \times 10^{31}$ kg star produces $1.00 \times 10^{44}$ J of energy. (a) How many kilograms of mass are converted to energy in the explosion? (b) What is the ratio Δm/m of mass destroyed to the original mass of the star?
64. (a) Using data from Potential Energy of a System (http://cnx.org/content/m58312/latest/#fs-id1165039443587), calculate the mass converted to energy by the fission of 1.00 kg of uranium. (b) What is the ratio of mass destroyed to the original mass, Δm/m?
65. (a) Using data from Potential Energy of a System (http://cnx.org/content/m58312/latest/#fs-id1165039443587), calculate the amount of mass converted to energy by the fusion of 1.00 kg of hydrogen. (b) What is the ratio of mass destroyed to the original mass, Δm/m? (c) How does this compare with Δm/m for the fission of 1.00 kg of uranium?
66. There is approximately $10^{34}$ J of energy available from fusion of hydrogen in the world’s oceans. (a) If $10^{33}$ J of this energy were utilized, what would be the decrease in mass of the oceans? (b) How great a volume of water does this correspond to? (c) Comment on whether this is a significant fraction of the total mass of the oceans.
67. A muon has a rest mass energy of 105.7 MeV, and it decays into an electron and a massless particle. (a) If all the lost mass is converted into the electron’s kinetic energy, find γ for the electron. (b) What is the electron’s velocity?
68. A π-meson is a particle that decays into a muon and a massless particle. The π-meson has a rest mass energy of 139.6 MeV, and the muon has a rest mass energy of 105.7 MeV. Suppose the π-meson is at rest and all of the missing mass goes into the muon’s kinetic energy. How fast will the muon move?
69. (a) Calculate the relativistic kinetic energy of a 1000-kg car moving at 30.0 m/s if the speed of light were only 45.0 m/s. (b) Find the ratio of the relativistic kinetic energy to classical.
70. Alpha decay is nuclear decay in which a helium nucleus is emitted. If the helium nucleus has a mass of $6.80 \times 10^{-27}$ kg and is given 5.00 MeV of kinetic energy, what is its velocity?
71. (a) Beta decay is nuclear decay in which an electron is emitted. If the electron is given 0.750 MeV of kinetic energy, what is its velocity? (b) Comment on how the high velocity is consistent with the kinetic energy as it compares to the rest mass energy of the electron.


ADDITIONAL PROBLEMS
72. (a) At what relative velocity is γ = 1.50? (b) At what relative velocity is γ = 100?
73. (a) At what relative velocity is γ = 2.00? (b) At what relative velocity is γ = 10.0?
74. Unreasonable Results (a) Find the value of γ required for the following situation. An earthbound observer measures 23.9 h to have passed while signals from a high-velocity space probe indicate that 24.0 h have passed on board. (b) What is unreasonable about this result? (c) Which assumptions are unreasonable or inconsistent?


75. (a) How long does it take the astronaut in Example 5.5 to travel 4.30 ly at 0.99944c (as measured by the earthbound observer)? (b) How long does it take according to the astronaut? (c) Verify that these two times are related through time dilation with γ = 30.00 as given.
76. (a) How fast would an athlete need to be running for a 100-m race to look 100 yd long? (b) Is the answer consistent with the fact that relativistic effects are difficult to observe in ordinary circumstances? Explain.
77. (a) Find the value of γ for the following situation. An astronaut measures the length of his spaceship to be 100 m, while an earthbound observer measures it to be 25.0 m. (b) What is the speed of the spaceship relative to Earth?
78. A clock in a spaceship runs one-tenth the rate at which an identical clock on Earth runs. What is the speed of the spaceship?
79. An astronaut has a heartbeat rate of 66 beats per minute as measured during his physical exam on Earth. The heartbeat rate of the astronaut is measured when he is in a spaceship traveling at 0.5c with respect to Earth by an observer (A) in the ship and by an observer (B) on Earth. (a) Describe an experimental method by which observer B on Earth will be able to determine the heartbeat rate of the astronaut when the astronaut is in the spaceship. (b) What will be the heartbeat rate(s) of the astronaut reported by observers A and B?
80. A spaceship (A) is moving at speed c/2 with respect to another spaceship (B). Observers in A and B set their clocks so that the event at (x, y, z, t) of turning on a laser in spaceship B has coordinates (0, 0, 0, 0) in A and also (0, 0, 0, 0) in B. An observer at the origin of B turns on the laser at t = 0 and turns it off at t = τ in his time. What is the time duration between on and off as seen by an observer in A?
81. Same two observers as in the preceding exercise, but now we look at two events occurring in spaceship A. A photon arrives at the origin of A at its time t = 0 and another photon arrives at (x = 1.00 m, 0, 0) at t = 0 in the frame of ship A. (a) Find the coordinates and times of the two events as seen by an observer in frame B. (b) In which frame are the two events simultaneous and in which frame are they not simultaneous?
82. Same two observers as in the preceding exercises. A rod of length 1 m is laid out on the x-axis in the frame of B from origin to (x = 1.00 m, 0, 0). What is the length of the rod observed by an observer in the frame of spaceship A?


83. An observer at the origin of inertial frame S sees a flashbulb go off at x = 150 km, y = 15.0 km, and z = 1.00 km at time $t = 4.5 \times 10^{-4}$ s. At what time and position in the S′ system did the flash occur, if S′ is moving along the shared x-direction with S at a velocity v = 0.6c?
84. An observer sees two events $1.5 \times 10^{-8}$ s apart at a separation of 800 m. How fast must a second observer be moving relative to the first to see the two events occur simultaneously?
85. An observer standing by the railroad tracks sees two bolts of lightning strike the ends of a 500-m-long train simultaneously at the instant the middle of the train passes him at 50 m/s. Use the Lorentz transformation to find the time between the lightning strikes as measured by a passenger seated in the middle of the train.
86. Two astronomical events are observed from Earth to occur at a time of 1 s apart and a distance separation of $1.5 \times 10^{9}$ m from each other. (a) Determine whether separation of the two events is space like or time like. (b) State what this implies about whether it is consistent with special relativity for one event to have caused the other.
87. Two astronomical events are observed from Earth to occur at a time of 0.30 s apart and a distance separation of $2.0 \times 10^{9}$ m from each other. How fast must a spacecraft travel from the site of one event toward the other to make the events occur at the same time when measured in the frame of reference of the spacecraft?
88. A spacecraft starts from being at rest at the origin and accelerates at a constant rate g, as seen from Earth, taken to be an inertial frame, until it reaches a speed of c/2. (a) Show that the increment of proper time is related to the elapsed time in Earth’s frame by: $d\tau = \sqrt{1 - v^2/c^2}\,dt$. (b) Find an expression for the elapsed time to reach speed c/2 as seen in Earth’s frame. (c) Use the relationship in (a) to obtain a similar expression for the elapsed proper time to reach c/2 as seen in the spacecraft, and determine the ratio of the time seen from Earth with that on the spacecraft to reach the final speed.
89. (a) All but the closest galaxies are receding from our own Milky Way Galaxy. If a galaxy $12.0 \times 10^{9}$ ly away is receding from us at 0.900c, at what velocity relative to us must we send an exploratory probe to approach the other galaxy at 0.990c as measured from that galaxy? (b) How long will it take the probe to reach the other galaxy as measured from Earth? You may assume that the velocity of the other galaxy remains constant. (c) How long will it then take for a radio signal to be beamed back? (All of this is possible in principle, but not practical.)
90. Suppose a spaceship heading straight toward the Earth at 0.750c can shoot a canister at 0.500c relative to the ship. (a) What is the velocity of the canister relative to Earth, if it is shot directly at Earth? (b) If it is shot directly away from Earth?
91. Repeat the preceding problem with the ship heading directly away from Earth.
92. If a spaceship is approaching the Earth at 0.100c and a message capsule is sent toward it at 0.100c relative to Earth, what is the speed of the capsule relative to the ship?
93. (a) Suppose the speed of light were only 3000 m/s. A jet fighter moving toward a target on the ground at 800 m/s shoots bullets, each having a muzzle velocity of 1000 m/s. What is the bullets’ velocity relative to the target? (b) If the speed of light were this small, would you observe relativistic effects in everyday life? Discuss.
94. A galaxy moving away from the Earth has a speed of 1000 km/s and emits 656 nm light characteristic of hydrogen (the most common element in the universe). (a) What wavelength would we observe on Earth? (b) What type of electromagnetic radiation is this? (c) Why is the speed of Earth in its orbit negligible here?
95. A space probe speeding towards the nearest star moves at 0.250c and sends radio information at a broadcast frequency of 1.00 GHz. What frequency is received on Earth?
96. Near the center of our galaxy, hydrogen gas is moving directly away from us in its orbit about a black hole. We receive 1900 nm electromagnetic radiation and know that it was 1875 nm when emitted by the hydrogen gas. What is the speed of the gas?
97. (a) Calculate the speed of a 1.00-µg particle of dust that has the same momentum as a proton moving at 0.999c. (b) What does the small speed tell us about the mass of a proton compared to even a tiny amount of macroscopic matter?
98. (a) Calculate γ for a proton that has a momentum of 1.00 kg·m/s. (b) What is its speed? Such protons form a rare component of cosmic radiation with uncertain origins.
99. (a) Show that the relativistic form of Newton’s second law is $F = m\dfrac{du}{dt}\dfrac{1}{\left(1 - u^2/c^2\right)^{3/2}}$. (b) Find the force needed to accelerate a mass of 1 kg by 1 m/s$^2$ when it is traveling at a velocity of c/2.
100. A positron is an antimatter version of the electron, having exactly the same mass. When a positron and an electron meet, they annihilate, converting all of their mass into energy. (a) Find the energy released, assuming negligible kinetic energy before the annihilation. (b) If this energy is given to a proton in the form of kinetic energy, what is its velocity? (c) If this energy is given to another electron in the form of kinetic energy, what is its velocity?
101. What is the kinetic energy in MeV of a π-meson that lives $1.40 \times 10^{-16}$ s as measured in the laboratory, and $0.840 \times 10^{-16}$ s when at rest relative to an observer, given that its rest energy is 135 MeV?
102. Find the kinetic energy in MeV of a neutron with a measured life span of 2065 s, given its rest energy is 939.6 MeV, and rest life span is 900 s.
103. (a) Show that $(pc)^2/(mc^2)^2 = \gamma^2 - 1$. This means that at large velocities $pc \gg mc^2$. (b) Is $E \approx pc$ when γ = 30.0, as for the astronaut discussed in the twin paradox?
104. One cosmic ray neutron has a velocity of 0.250c relative to the Earth. (a) What is the neutron’s total energy in MeV? (b) Find its momentum. (c) Is $E \approx pc$ in this situation? Discuss in terms of the equation given in part (a) of the previous problem.
105. What is γ for a proton having a mass energy of 938.3 MeV accelerated through an effective potential of 1.0 TV (teravolt)?
106. (a) What is the effective accelerating potential for electrons at the Stanford Linear Accelerator, if $\gamma = 1.00 \times 10^{5}$ for them? (b) What is their total energy (nearly the same as kinetic in this case) in GeV?
107. (a) Using data from Potential Energy of a System (http://cnx.org/content/m58312/latest/#fs-id1165039443587), find the mass destroyed when the energy in a barrel of crude oil is released. (b) Given these barrels contain 200 liters and assuming the density of crude oil is 750 kg/m$^3$, what is the ratio of mass destroyed to original mass, Δm/m?
108. (a) Calculate the energy released by the destruction of 1.00 kg of mass. (b) How many kilograms could be lifted to a 10.0 km height by this amount of energy?
109. A Van de Graaff accelerator utilizes a 50.0 MV potential difference to accelerate charged particles such as protons. (a) What is the velocity of a proton accelerated by such a potential? (b) An electron?
110. Suppose you use an average of 500 kW·h of electric energy per month in your home. (a) How long would 1.00 g of mass converted to electric energy with an efficiency of 38.0% last you? (b) How many homes could be supplied at the 500 kW·h per month rate for one year by the energy from the described mass conversion?
111. (a) A nuclear power plant converts energy from nuclear fission into electricity with an efficiency of 35.0%. How much mass is destroyed in one year to produce a continuous 1000 MW of electric power? (b) Do you think it would be possible to observe this mass loss if the total mass of the fuel is $10^{4}$ kg?
112. Nuclear-powered rockets were researched for some years before safety concerns became paramount. (a) What fraction of a rocket’s mass would have to be destroyed to get it into a low Earth orbit, neglecting the decrease in gravity? (Assume an orbital altitude of 250 km, and calculate both the kinetic energy (classical) and the gravitational potential energy needed.) (b) If the ship has a mass of $1.00 \times 10^{5}$ kg (100 tons), what total yield nuclear explosion in tons of TNT is needed?
113. The sun produces energy at a rate of $3.85 \times 10^{26}$ W by the fusion of hydrogen. About 0.7% of each kilogram of hydrogen goes into the energy generated by the Sun. (a) How many kilograms of hydrogen undergo fusion each second? (b) If the sun is 90.0% hydrogen and half of this can undergo fusion before the sun changes character, how long could it produce energy at its current rate? (c) How many kilograms of mass is the sun losing per second? (d) What fraction of its mass will it have lost in the time found in part (b)?
114. Show that $E^2 - p^2c^2$ for a particle is invariant under Lorentz transformations.


6 | PHOTONS AND MATTER WAVES


Figure 6.1 In this image of pollen taken with an electron microscope, the bean-shaped grains are about 50 µm long. Electron microscopes can have a much higher resolving power than a conventional light microscope because electron wavelengths can be 100,000 times shorter than the wavelengths of visible-light photons. (credit: modification of work by Dartmouth College Electron Microscope Facility)


Chapter Outline
6.1 Blackbody Radiation
6.2 Photoelectric Effect
6.3 The Compton Effect
6.4 Bohr’s Model of the Hydrogen Atom
6.5 De Broglie’s Matter Waves
6.6 Wave-Particle Duality


Introduction
Two of the most revolutionary concepts of the twentieth century were the description of light as a collection of particles, and the treatment of particles as waves. These wave properties of matter have led to the discovery of technologies such as electron microscopy, which allows us to examine submicroscopic objects such as grains of pollen, as shown above.
In this chapter, you will learn about the energy quantum, a concept that was introduced in 1900 by the German physicist Max Planck to explain blackbody radiation. We discuss how Albert Einstein extended Planck’s concept to a quantum of light (a “photon”) to explain the photoelectric effect. We also show how American physicist Arthur H. Compton used the photon concept in 1923 to explain wavelength shifts observed in X-rays. After a discussion of Bohr’s model of hydrogen, we describe how matter waves were postulated in 1924 by Louis-Victor de Broglie to justify Bohr’s model and we examine the experiments conducted in 1923–1927 by Clinton Davisson and Lester Germer that confirmed the existence of de Broglie’s matter waves.




6.1 | Blackbody Radiation
Learning Objectives


By the end of this section you will be able to:
• Apply Wien’s and Stefan’s laws to analyze radiation emitted by a blackbody
• Explain Planck’s hypothesis of energy quanta


All bodies emit electromagnetic radiation over a range of wavelengths. In an earlier chapter, we learned that a cooler body radiates less energy than a warmer body. We also know by observation that when a body is heated and its temperature rises, the perceived wavelength of its emitted radiation changes from infrared to red, and then from red to orange, and so forth. As its temperature rises, the body glows with the colors corresponding to ever-smaller wavelengths of the electromagnetic spectrum. This is the underlying principle of the incandescent light bulb: A hot metal filament glows red, and when heating continues, its glow eventually covers the entire visible portion of the electromagnetic spectrum. The temperature (T) of the object that emits radiation, or the emitter, determines the wavelength at which the radiated energy is at its maximum. For example, the Sun, whose surface temperature is in the range between 5000 K and 6000 K, radiates most strongly in a range of wavelengths about 560 nm in the visible part of the electromagnetic spectrum. Your body, when at its normal temperature of about 300 K, radiates most strongly in the infrared part of the spectrum.
Radiation that is incident on an object is partially absorbed and partially reflected. At thermodynamic equilibrium, the rate at which an object absorbs radiation is the same as the rate at which it emits it. Therefore, a good absorber of radiation (any object that absorbs radiation) is also a good emitter. A perfect absorber absorbs all electromagnetic radiation incident on it; such an object is called a blackbody.
Although the blackbody is an idealization, because no physical object absorbs 100% of incident radiation, we can construct a close realization of a blackbody in the form of a small hole in the wall of a sealed enclosure known as a cavity radiator, as shown in Figure 6.2. The inside walls of a cavity radiator are rough and blackened so that any radiation that enters through a tiny hole in the cavity wall becomes trapped inside the cavity. At thermodynamic equilibrium (at temperature T), the cavity walls absorb exactly as much radiation as they emit. Furthermore, inside the cavity, the radiation entering the hole is balanced by the radiation leaving it. The emission spectrum of a blackbody can be obtained by analyzing the light radiating from the hole. Electromagnetic waves emitted by a blackbody are called blackbody radiation.


Figure 6.2 A blackbody is physically realized by a small hole in the wall of a cavity radiator.


The intensity I(λ, T) of blackbody radiation depends on the wavelength λ of the emitted radiation and on the temperature
T of the blackbody (Figure 6.3). The function I(λ, T) is the power intensity that is radiated per unit wavelength; in
other words, it is the power radiated per unit area of the hole in a cavity radiator per unit wavelength. According to this definition, I(λ, T)dλ is the power per unit area that is emitted in the wavelength interval from λ to λ + dλ. The intensity
distribution among wavelengths of radiation emitted by cavities was studied experimentally at the end of the nineteenth century. Generally, radiation emitted by materials only approximately follows the blackbody radiation curve (Figure 6.4); however, spectra of common stars do follow the blackbody radiation curve very closely.


Figure 6.3 The intensity of blackbody radiation versus the wavelength of the emitted radiation. Each curve corresponds to a different blackbody temperature, starting with a low temperature (the lowest curve) to a high temperature (the highest curve).


Figure 6.4 The spectrum of radiation emitted from a quartz surface (blue curve) and the blackbody radiation curve (black curve) at 600 K.


Two important laws summarize the experimental findings of blackbody radiation: Wien’s displacement law and Stefan’s law. Wien’s displacement law is illustrated in Figure 6.3 by the curve connecting the maxima on the intensity curves. In these
curves, we see that the hotter the body, the shorter the wavelength corresponding to the emission peak in the radiation curve. Quantitatively, Wien’s law reads


$$\lambda_{\max} T = 2.898 \times 10^{-3}\ \mathrm{m \cdot K} \tag{6.1}$$


where $\lambda_{\max}$ is the position of the maximum in the radiation curve. In other words, $\lambda_{\max}$ is the wavelength at which a
blackbody radiates most strongly at a given temperature T. Note that in Equation 6.1, the temperature is in kelvins. Wien’s displacement law allows us to estimate the temperatures of distant stars by measuring the wavelength at which their emitted radiation peaks.
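Wien’s displacement law is simple enough to evaluate directly. The following is a minimal Python sketch, not part of the original text; the function names `peak_wavelength` and `temperature_from_peak` are illustrative choices, and the only physical input is the constant quoted in Equation 6.1.

```python
# Wien's displacement law (Equation 6.1): lambda_max * T = b.
WIEN_B = 2.898e-3  # m*K, the constant quoted in Equation 6.1


def peak_wavelength(temperature_k):
    """Return the peak emission wavelength (metres) of a blackbody at T (kelvins)."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvins)")
    return WIEN_B / temperature_k


def temperature_from_peak(wavelength_m):
    """Invert Wien's law: estimate T (kelvins) from the peak wavelength (metres)."""
    if wavelength_m <= 0:
        raise ValueError("wavelength must be positive")
    return WIEN_B / wavelength_m


if __name__ == "__main__":
    # Temperatures mentioned in the text: the Sun's surface (5000 K to 6000 K)
    # and the human body (about 300 K).
    for t in (5000.0, 6000.0, 300.0):
        print(f"T = {t:6.0f} K -> lambda_max = {peak_wavelength(t) * 1e9:8.1f} nm")
    # A peak near 560 nm corresponds to a temperature inside the Sun's range.
    print(f"560 nm -> T = {temperature_from_peak(560e-9):.0f} K")
```

For 5000 K and 6000 K the peaks fall at roughly 580 nm and 483 nm, which brackets the 560 nm quoted earlier for the Sun, while a body at 300 K peaks near 10 µm, deep in the infrared.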
Example 6.1


Temperatures of Distant Stars
On a clear evening during the winter months, if you happen to be in the Northern Hemisphere and look up at the sky, you can see the constellation Orion (The Hunter). One star in this constellation, Rigel, flickers in a blue color and another star, Betelgeuse, has a reddish color, as shown in Figure 6.5. Which of these two stars is cooler, Betelgeuse or Rigel?
Strategy
We treat each star as a blackbody. Then according to Wien’s law, its temperature is inversely proportional to the
wavelength of its peak intensity. The wavelength $\lambda_{\max}^{(\mathrm{blue})}$ of blue light is shorter than the wavelength $\lambda_{\max}^{(\mathrm{red})}$ of
red light. Even if we do not know the precise wavelengths, we can still set up a proportion.
Solution
Writing Wien’s law for the blue star and for the red star, we have


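$$\lambda_{\max}^{(\mathrm{red})}\, T_{(\mathrm{red})} = 2.898 \times 10^{-3}\ \mathrm{m \cdot K} = \lambda_{\max}^{(\mathrm{blue})}\, T_{(\mathrm{blue})} \tag{6.2}$$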
When simplified, Equation 6.2 gives


$$T_{(\mathrm{red})} = \frac{\lambda_{\max}^{(\mathrm{blue})}}{\lambda_{\max}^{(\mathrm{red})}}\, T_{(\mathrm{blue})} < T_{(\mathrm{blue})} \tag{6.3}$$


Therefore, Betelgeuse is cooler than Rigel.
Significance
Note that Wien’s displacement law tells us that the higher the temperature of an emitting body, the shorter the wavelength of the radiation it emits. The qualitative analysis presented in this example is generally valid for any emitting body, whether it is a big object such as a star or a small object such as the glowing filament in an incandescent lightbulb.


6.1 Check Your Understanding The flame of a peach-scented candle has a yellowish color and the flame of a Bunsen burner in a chemistry lab has a bluish color. Which flame has a higher temperature?


Figure 6.5 In the Orion constellation, the red star Betelgeuse, which usually takes on a yellowish tint, appears as the figure’s right shoulder (in the upper left). The giant blue star on the bottom right is Rigel, which appears as the hunter’s left foot. (credit left: modification of work by NASA c/o Matthew Spinelli)


The second experimental relation is Stefan’s law, which concerns the total power of blackbody radiation emitted across the entire spectrum of wavelengths at a given temperature. In Figure 6.3, this total power is represented by the area under the blackbody radiation curve for a given T. As the temperature of a blackbody increases, the total emitted power also increases. Quantitatively, Stefan’s law expresses this relation as


$$P(T) = \sigma A T^{4} \tag{6.4}$$


where A is the surface area of a blackbody, T is its temperature (in kelvins), and σ is the Stefan–Boltzmann constant,
$\sigma = 5.670 \times 10^{-8}\ \mathrm{W/(m^{2} \cdot K^{4})}$. Stefan’s law enables us to estimate how much energy a star is radiating by remotely
measuring its temperature.
Example 6.2


Power Radiated by Stars
A star such as our Sun will eventually evolve to a “red giant” star and then to a “white dwarf” star. A typical white
dwarf is approximately the size of Earth, and its surface temperature is about $2.5 \times 10^{4}$ K. A typical red giant
has a surface temperature of $3.0 \times 10^{3}$ K and a radius ~100,000 times larger than that of a white dwarf. What is
the average radiated power per unit area and the total power radiated by each of these types of stars? How do they compare?
Strategy
If we treat the star as a blackbody, then according to Stefan’s law, the total power that the star radiates is proportional to the fourth power of its temperature. To find the power radiated per unit area of the surface, we do not need to make any assumptions about the shape of the star because P/A depends only on temperature. However, to compute the total power, we need to make an assumption that the energy radiates through a spherical surface
enclosing the star, so that the surface area is $A = 4\pi R^{2}$, where R is its radius.


Solution
A simple proportion based on Stefan’s law gives


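$$\frac{P_{\mathrm{dwarf}}/A_{\mathrm{dwarf}}}{P_{\mathrm{giant}}/A_{\mathrm{giant}}} = \frac{\sigma T_{\mathrm{dwarf}}^{4}}{\sigma T_{\mathrm{giant}}^{4}} = \left(\frac{T_{\mathrm{dwarf}}}{T_{\mathrm{giant}}}\right)^{4} = \left(\frac{2.5 \times 10^{4}}{3.0 \times 10^{3}}\right)^{4} = 4820 \tag{6.5}$$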


The power emitted per unit area by a white dwarf is about 5000 times the power emitted per unit area by a red giant.
Denoting this ratio by $a = 4.8 \times 10^{3}$, Equation 6.5 gives


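$$\frac{P_{\mathrm{dwarf}}}{P_{\mathrm{giant}}} = a\,\frac{A_{\mathrm{dwarf}}}{A_{\mathrm{giant}}} = a\,\frac{4\pi R_{\mathrm{dwarf}}^{2}}{4\pi R_{\mathrm{giant}}^{2}} = a\left(\frac{R_{\mathrm{dwarf}}}{R_{\mathrm{giant}}}\right)^{2} = 4.8 \times 10^{3}\left(\frac{R_{\mathrm{dwarf}}}{10^{5} R_{\mathrm{dwarf}}}\right)^{2} = 4.8 \times 10^{-7} \tag{6.6}$$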


We see that the total power emitted by a white dwarf is a tiny fraction of the total power emitted by a red giant. Despite its relatively lower temperature, the overall power radiated by a red giant far exceeds that of the white dwarf because the red giant has a much larger surface area. To estimate the absolute value of the emitted power per unit area, we again use Stefan’s law. For the white dwarf, we obtain
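$$\frac{P_{\mathrm{dwarf}}}{A_{\mathrm{dwarf}}} = \sigma T_{\mathrm{dwarf}}^{4} = \left(5.670 \times 10^{-8}\ \frac{\mathrm{W}}{\mathrm{m^{2} \cdot K^{4}}}\right)\left(2.5 \times 10^{4}\ \mathrm{K}\right)^{4} = 2.2 \times 10^{10}\ \mathrm{W/m^{2}} \tag{6.7}$$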


The analogous result for the red giant is obtained by scaling the result for a white dwarf:
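$$\frac{P_{\mathrm{giant}}}{A_{\mathrm{giant}}} = \frac{2.2 \times 10^{10}}{4.82 \times 10^{3}}\ \frac{\mathrm{W}}{\mathrm{m^{2}}} = 4.56 \times 10^{6}\ \frac{\mathrm{W}}{\mathrm{m^{2}}} \cong 4.6 \times 10^{6}\ \frac{\mathrm{W}}{\mathrm{m^{2}}} \tag{6.8}$$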


Significance
To estimate the total power emitted by a white dwarf, in principle, we could use Equation 6.7. However, to find its surface area, we need to know the average radius, which is not given in this example. Therefore, the solution stops here. The same is also true for the red giant star.
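The arithmetic of Example 6.2 can be checked with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, and the final estimate of total power assumes that Earth’s mean radius (about 6.37 × 10^6 m, a value not given in the example) stands in for the white dwarf’s radius, following the remark that a typical white dwarf is approximately the size of Earth.

```python
import math

SIGMA = 5.670e-8  # W/(m^2*K^4), Stefan-Boltzmann constant as quoted in the text


def radiated_flux(temperature_k):
    """Power radiated per unit area, P/A = sigma * T^4 (Equation 6.4 divided by A)."""
    return SIGMA * temperature_k ** 4


def radiated_power(temperature_k, radius_m):
    """Total power of a spherical blackbody, P = sigma * (4*pi*R^2) * T^4."""
    return radiated_flux(temperature_k) * 4.0 * math.pi * radius_m ** 2


T_DWARF = 2.5e4       # K, from Example 6.2
T_GIANT = 3.0e3       # K, from Example 6.2
RADIUS_RATIO = 1.0e5  # R_giant / R_dwarf, from Example 6.2

# Ratio of power per unit area (Equation 6.5) and of total power (Equation 6.6).
flux_ratio = radiated_flux(T_DWARF) / radiated_flux(T_GIANT)
power_ratio = flux_ratio / RADIUS_RATIO ** 2

if __name__ == "__main__":
    print(f"P/A (white dwarf) = {radiated_flux(T_DWARF):.2e} W/m^2")
    print(f"P/A (red giant)   = {radiated_flux(T_GIANT):.2e} W/m^2")
    print(f"flux ratio  = {flux_ratio:.0f}")
    print(f"power ratio = {power_ratio:.1e}")

    # ASSUMPTION: Earth's mean radius is used as the white dwarf's radius.
    R_DWARF = 6.37e6  # m
    print(f"P (white dwarf) ~ {radiated_power(T_DWARF, R_DWARF):.1e} W")
    print(f"P (red giant)   ~ {radiated_power(T_GIANT, RADIUS_RATIO * R_DWARF):.1e} W")
```

Evaluating σT⁴ directly for the red giant gives about 4.59 × 10^6 W/m², slightly different from the 4.56 × 10^6 W/m² in Equation 6.8, because that equation divides two already-rounded numbers; both agree with 4.6 × 10^6 W/m² to two significant figures.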


6.2 Check Your Understanding An iron poker is being heated. As its temperature rises, the poker begins to glow: first dull red, then bright red, then orange, and then yellow. Use either the blackbody radiation curve or Wien’s law to explain these changes in the color of the glow.


6.3 Check Your Understanding Suppose that two stars, α and β, radiate exactly the same total power. If
the radius of star α is three times that of star β, what is the ratio of the surface temperatures of these stars?
Which one is hotter?


The term “blackbody” was coined by Gustav R. Kirchhoff in 1862. The blackbody radiation curve was known experimentally, but its shape eluded physical explanation until the year 1900. The physical model of a blackbody at temperature T is that of the electromagnetic waves enclosed in a cavity (see Figure 6.2) and at thermodynamic equilibrium with the cavity walls. The waves can exchange energy with the walls. The objective here is to find the energy density distribution among various modes of vibration at various wavelengths (or frequencies). In other words, we want to know how much energy is carried by a single wavelength or a band of wavelengths. Once we know the energy distribution, we can use standard statistical methods (similar to those studied in a previous chapter) to obtain the blackbody radiation curve, Stefan’s law, and Wien’s displacement law. When the physical model is correct, the theoretical predictions should be the same as the experimental curves.
In a classical approach to the blackbody radiation problem, in which radiation is treated as waves (as you have studied in previous chapters), the modes of electromagnetic waves trapped in the cavity are in equilibrium and continually exchange their energies with the cavity walls. There is no physical reason why a wave should do otherwise: Any amount of energy can be exchanged, either by being transferred from the wave to the material in the wall or by being received by the wave from the material in the wall. This classical picture is the basis of the model developed by Lord Rayleigh and, independently, by Sir James Jeans. The result of this classical model for blackbody radiation curves is known as the Rayleigh–Jeans law.


However, as shown in Figure 6.6, the Rayleigh–Jeans law fails to correctly reproduce experimental results. In the limit of short wavelengths, the Rayleigh–Jeans law predicts infinite radiation intensity, which is inconsistent with the experimental results in which radiation intensity has finite values in the ultraviolet region of the spectrum. This divergence between the results of classical theory and experiments, which came to be called the ultraviolet catastrophe, shows how classical physics fails to explain the mechanism of blackbody radiation.


Figure 6.6 The ultraviolet catastrophe: The Rayleigh–Jeans law does not explain the observed blackbody emission spectrum.
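The divergence sketched in Figure 6.6 can be reproduced numerically. The Python sketch below is an illustration only. It assumes the standard form of the Rayleigh–Jeans law for power per unit area per unit wavelength, 2πc·kB·T/λ⁴, which the text names but does not write out, and it compares that with Planck’s law, which is introduced later in this section as Equation 6.11. The value of c is the standard one and is not quoted in the text; the function names are illustrative.

```python
import math

H = 6.626e-34    # J*s, Planck's constant (Equation 6.10)
C = 2.998e8      # m/s, speed of light in vacuum (standard value, not quoted in the text)
K_B = 1.380e-23  # J/K, Boltzmann's constant (quoted with Equation 6.11)


def rayleigh_jeans(wavelength_m, temperature_k):
    """Classical prediction 2*pi*c*k_B*T / lambda^4, in W/m^3 (assumed standard form)."""
    return 2.0 * math.pi * C * K_B * temperature_k / wavelength_m ** 4


def planck(wavelength_m, temperature_k):
    """Planck's law (Equation 6.11), in W/m^3."""
    x = H * C / (wavelength_m * K_B * temperature_k)
    if x > 700.0:
        # exp(x) would overflow a float; the intensity is effectively zero here.
        return 0.0
    return 2.0 * math.pi * H * C ** 2 / (wavelength_m ** 5 * math.expm1(x))


if __name__ == "__main__":
    T = 5000.0  # K, an illustrative temperature
    print("wavelength (nm)   Rayleigh-Jeans      Planck")
    for lam_nm in (1.0e6, 1.0e5, 1.0e4, 1.0e3, 1.0e2, 1.0e1):
        lam = lam_nm * 1e-9
        print(f"{lam_nm:14.0f}   {rayleigh_jeans(lam, T):.3e}   {planck(lam, T):.3e}")
```

At long wavelengths the two columns agree closely, while toward short wavelengths the classical value keeps growing as 1/λ⁴ and the Planck value drops toward zero.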


The blackbody radiation problem was solved in 1900 by Max Planck. Planck used the same idea as the Rayleigh–Jeans model in the sense that he treated the electromagnetic waves between the walls inside the cavity classically, and assumed that the radiation is in equilibrium with the cavity walls. The innovative idea that Planck introduced in his model is the assumption that the cavity radiation originates from atomic oscillations inside the cavity walls, and that these oscillations can have only discrete values of energy. Therefore, the radiation trapped inside the cavity can exchange energy with the walls only in discrete amounts. Planck’s hypothesis of discrete energy values, which he called quanta, assumes that the oscillators inside the cavity walls have quantized energies. This was a brand new idea that went beyond the classical physics of the nineteenth century because, as you learned in a previous chapter, in the classical picture, the energy of an oscillator can take on any continuous value. Planck assumed that the energy of an oscillator ($E_{n}$) can have only discrete,
or quantized, values:


$$E_{n} = nhf, \quad \text{where } n = 1, 2, 3, \ldots \tag{6.9}$$


In Equation 6.9, f is the frequency of Planck’s oscillator. The natural number n that enumerates these discrete energies is called a quantum number. The physical constant h is called Planck’s constant:


$$h = 6.626 \times 10^{-34}\ \mathrm{J \cdot s} = 4.136 \times 10^{-15}\ \mathrm{eV \cdot s} \tag{6.10}$$


Each discrete energy value corresponds to a quantum state of a Planck oscillator. Quantum states are enumerated by quantum numbers. For example, when Planck’s oscillator is in its first n = 1 quantum state, its energy is $E_{1} = hf$; when
it is in the n = 2 quantum state, its energy is $E_{2} = 2hf$; when it is in the n = 3 quantum state, $E_{3} = 3hf$; and so on.
Note that Equation 6.9 shows that there are infinitely many quantum states, which can be represented as a sequence {hf, 2hf, 3hf, …, (n – 1)hf, nhf, (n + 1)hf, …}. Each two consecutive quantum states in this sequence are separated by an energy jump, ΔE = hf. An oscillator in the wall can receive energy from the radiation in the cavity (absorption), or it can give
away energy to the radiation in the cavity (emission). The absorption process sends the oscillator to a higher quantum state, and the emission process sends the oscillator to a lower quantum state. Whichever way this exchange of energy goes, the smallest amount of energy that can be exchanged is hf. There is no upper limit to how much energy can be exchanged, but whatever is exchanged must be an integer multiple of hf. If the energy packet does not have this exact amount, it is neither
absorbed nor emitted at the wall of the blackbody.
Planck’s Quantum Hypothesis
Planck’s hypothesis of energy quanta states that the amount of energy emitted by the oscillator is carried by the quantum of radiation, ΔE:

$$\Delta E = hf$$


Recall that the frequency of electromagnetic radiation is related to its wavelength and to the speed of light by the fundamental relation fλ = c. This means that we can express the energy quantum ΔE = hf equivalently in terms of wavelength λ, as ΔE = hc/λ.
When included in the computation of the energy density of a blackbody, Planck’s hypothesis gives the following theoretical expression for the power intensity of emitted radiation per unit wavelength:


$$I(\lambda, T) = \frac{2\pi h c^{2}}{\lambda^{5}}\,\frac{1}{e^{hc/\lambda k_{B} T} - 1} \tag{6.11}$$


where c is the speed of light in vacuum and $k_{B}$ is Boltzmann’s constant, $k_{B} = 1.380 \times 10^{-23}\ \mathrm{J/K}$. The theoretical formula
expressed in Equation 6.11 is called Planck’s blackbody radiation law. This law is in agreement with the experimental blackbody radiation curve (see Figure 6.7). In addition, Wien’s displacement law and Stefan’s law can both be derived from Equation 6.11. To derive Wien’s displacement law, we use differential calculus to find the maximum of the radiation intensity curve I(λ, T). To derive Stefan’s law and find the value of the Stefan–Boltzmann constant, we use integral
calculus and integrate I(λ, T) to find the total power radiated by a blackbody at one temperature in the entire spectrum of
wavelengths from λ = 0 to λ = ∞. This derivation is left as an exercise later in this chapter.
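Before doing the calculus, both claims can be checked numerically. The Python sketch below is a rough illustration rather than a derivation: it evaluates Equation 6.11 on a logarithmic wavelength grid, locates the maximum by direct search, and integrates with the trapezoidal rule. The grid limits (10 nm to 1 cm), the number of grid points, and the function names are arbitrary choices made here, and the value of c is the standard one rather than a number quoted in the text.

```python
import math

H = 6.626e-34    # J*s, Planck's constant (Equation 6.10)
C = 2.998e8      # m/s, speed of light in vacuum (standard value, not quoted in the text)
K_B = 1.380e-23  # J/K, Boltzmann's constant as quoted in the text


def planck(wavelength_m, temperature_k):
    """Planck's law (Equation 6.11), in W/m^3."""
    x = H * C / (wavelength_m * K_B * temperature_k)
    if x > 700.0:
        # exp(x) would overflow a float; the intensity is effectively zero here.
        return 0.0
    return 2.0 * math.pi * H * C ** 2 / (wavelength_m ** 5 * math.expm1(x))


def log_grid(start, stop, count):
    """Return `count` wavelengths spaced evenly in log10 between start and stop."""
    step = (math.log10(stop) - math.log10(start)) / (count - 1)
    return [10.0 ** (math.log10(start) + i * step) for i in range(count)]


def peak_wavelength(temperature_k, grid):
    """Grid wavelength at which Planck's law is largest (numerical Wien's law)."""
    return max(grid, key=lambda lam: planck(lam, temperature_k))


def total_flux(temperature_k, grid):
    """Trapezoidal estimate of the integral of I(lambda, T) over the grid."""
    total = 0.0
    for a, b in zip(grid, grid[1:]):
        total += 0.5 * (planck(a, temperature_k) + planck(b, temperature_k)) * (b - a)
    return total


GRID = log_grid(1e-8, 1e-2, 20001)  # 10 nm to 1 cm

if __name__ == "__main__":
    for T in (3000.0, 6000.0):
        lam_max = peak_wavelength(T, GRID)
        flux = total_flux(T, GRID)
        print(f"T = {T:.0f} K: lambda_max*T = {lam_max * T:.4e} m*K, "
              f"flux/T^4 = {flux / T ** 4:.4e} W/(m^2*K^4)")
```

With these rounded constants, the product λmax·T comes out within a fraction of a percent of the constant in Equation 6.1, and the ratio of the integrated flux to T⁴ comes out within about a percent of the Stefan–Boltzmann constant, independent of the temperature chosen.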


Figure 6.7 Planck’s theoretical result (continuous curve) and the experimental blackbody radiation curve (dots).


Example 6.3
Planck’s Quantum Oscillator
A quantum oscillator in the cavity wall in Figure 6.2 is vibrating at a frequency of $5.0 \times 10^{14}$ Hz. Calculate
the spacing between its energy levels.
Strategy
Energy states of a quantum oscillator are given by Equation 6.9. The energy spacing ΔE is obtained by finding
the energy difference between two adjacent quantum states for quantum numbers n + 1 and n.


Solution
We can substitute the given frequency and Planck’s constant directly into the equation:


$$\Delta E = E_{n+1} - E_{n} = (n + 1)hf - nhf = hf = (6.626 \times 10^{-34}\ \mathrm{J \cdot s})(5.0 \times 10^{14}\ \mathrm{Hz}) = 3.3 \times 10^{-19}\ \mathrm{J}$$


Significance
Note that we do not specify what kind of material was used to build the cavity. Here, a quantum oscillator is a theoretical model of an atom or molecule of material in the wall.
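The level structure of Equation 6.9 and the spacing found in Example 6.3 can be tabulated with a few lines of Python. This is a minimal sketch with illustrative function names; it uses only the two values of Planck’s constant given in Equation 6.10.

```python
H_J = 6.626e-34   # J*s, Planck's constant (Equation 6.10)
H_EV = 4.136e-15  # eV*s, Planck's constant (Equation 6.10)


def energy_level(n, frequency_hz, h=H_J):
    """E_n = n*h*f for quantum number n = 1, 2, 3, ... (Equation 6.9)."""
    if n < 1 or int(n) != n:
        raise ValueError("n must be a positive integer")
    return n * h * frequency_hz


def level_spacing(frequency_hz, h=H_J):
    """Spacing between adjacent levels, E_(n+1) - E_n = h*f, independent of n."""
    return energy_level(2, frequency_hz, h) - energy_level(1, frequency_hz, h)


F = 5.0e14  # Hz, the oscillator frequency from Example 6.3

if __name__ == "__main__":
    print(f"Delta E = {level_spacing(F):.2e} J = {level_spacing(F, H_EV):.2f} eV")
    for n in (1, 2, 3):
        print(f"E_{n} = {energy_level(n, F, H_EV):.2f} eV")
```

Passing `H_EV` instead of the default returns the same energies in electron-volts, which is often the more convenient unit at atomic scales.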


6.4 Check Your Understanding A molecule is vibrating at a frequency of $5.0 \times 10^{14}$ Hz. What is the
smallest spacing between its vibrational energy levels?


Example 6.4
Quantum Theory Applied to a Classical Oscillator
A 1.0-kg mass oscillates at the end of a spring with a spring constant of 1000 N/m. The amplitude of these oscillations is 0.10 m. Use the concept of quantization to find the energy spacing for this classical oscillator. Is the energy quantization significant for macroscopic systems, such as this oscillator?
Strategy
We use Planck’s energy quantum, ΔE = hf, as though the system were a quantum oscillator, but with the frequency f of the mass vibrating on a spring. To evaluate whether or not quantization has a significant effect, we compare the quantum energy spacing with the macroscopic total energy of this classical oscillator.
Solution
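For the spring constant, $k = 1.0 \times 10^{3}\ \mathrm{N/m}$, the frequency f of the mass, $m = 1.0\ \mathrm{kg}$, is

$$f = \frac{1}{2\pi}\sqrt{\frac{k}{m}} = \frac{1}{2\pi}\sqrt{\frac{1.0 \times 10^{3}\ \mathrm{N/m}}{1.0\ \mathrm{kg}}} \simeq 5.0\ \mathrm{Hz}$$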


The energy quantum that corresponds to this frequency is
$$\Delta E = hf = (6.626 \times 10^{-34}\ \mathrm{J \cdot s})(5.0\ \mathrm{Hz}) = 3.3 \times 10^{-33}\ \mathrm{J}$$


When vibrations have amplitude $A = 0.10\ \mathrm{m}$, the energy of oscillations is

$$E = \frac{1}{2}kA^{2} = \frac{1}{2}(1000\ \mathrm{N/m})(0.10\ \mathrm{m})^{2} = 5.0\ \mathrm{J}$$


Significance
Thus, for a classical oscillator, we have $\Delta E / E = (3.3 \times 10^{-33}\ \mathrm{J})/(5.0\ \mathrm{J}) \approx 7 \times 10^{-34}$. We see that the separation of the energy levels is
immeasurably small. Therefore, for all practical purposes, the energy of a classical oscillator takes on continuous values. This is why classical principles may be applied to macroscopic systems encountered in everyday life without loss of accuracy.
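The comparison in Example 6.4 can be scripted so that the same check is easy to repeat for other springs and masses. The following Python sketch is illustrative only; the function names are hypothetical and the inputs are the values given in the example.

```python
import math

H = 6.626e-34  # J*s, Planck's constant (Equation 6.10)


def spring_frequency(k, m):
    """Frequency of a mass m (kg) on a spring of constant k (N/m): f = sqrt(k/m) / (2*pi)."""
    return math.sqrt(k / m) / (2.0 * math.pi)


def oscillation_energy(k, amplitude):
    """Total energy of a classical oscillator with amplitude A (m): E = k*A^2 / 2."""
    return 0.5 * k * amplitude ** 2


def quantum_ratio(k, m, amplitude):
    """Ratio of the energy quantum h*f to the total oscillation energy."""
    return H * spring_frequency(k, m) / oscillation_energy(k, amplitude)


if __name__ == "__main__":
    k, m, a = 1000.0, 1.0, 0.10  # values from Example 6.4
    print(f"f       = {spring_frequency(k, m):.2f} Hz")
    print(f"Delta E = {H * spring_frequency(k, m):.2e} J")
    print(f"E       = {oscillation_energy(k, a):.2f} J")
    print(f"ratio   = {quantum_ratio(k, m, a):.1e}")
```

The ratio is the quantity that decides whether quantization is observable: the closer it is to 1, the less the energy can be treated as continuous.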


6.5 Check Your Understanding Would the result in Example 6.4 be different if the mass were not 1.0 kg but a tiny mass of 1.0 µg, and the amplitude of vibrations were 0.10 µm?


When Planck first published his result, the hypothesis of energy quanta was not taken seriously by the physics community because it did not follow from any established physics theory at that time. It was perceived, even by Planck himself, as a useful mathematical trick that led to a good theoretical “fit” to the experimental curve. This perception was changed in 1905 when Einstein published his explanation of the photoelectric effect, in which he gave Planck’s energy quantum a new meaning: that of a particle of light.


6.2 | Photoelectric Effect
Learning Objectives


By the end of this section you will be able to:
• Describe physical characteristics of the photoelectric effect
• Explain why the photoelectric effect cannot be explained by classical physics
• Describe how Einstein’s idea of a particle of radiation explains the photoelectric effect


When a metal surface is exposed to a monochromatic electromagnetic wave of sufficiently short wavelength (or equivalently, above a threshold frequency), the incident radiation is absorbed and the exposed surface emits electrons. This phenomenon is known as the photoelectric effect. Electrons that are emitted in this process are called photoelectrons.
The experimental setup to study the photoelectric effect is shown schematically in Figure 6.8. The target material serves as the cathode, which becomes the emitter of photoelectrons when it is illuminated by monochromatic radiation. We call this electrode the photoelectrode. Photoelectrons are collected at the anode, which is kept at a higher potential with respect to the cathode. The potential difference between the electrodes can be increased or decreased, or its polarity can be reversed. The electrodes are enclosed in an evacuated glass tube so that photoelectrons do not lose their kinetic energy on collisions with air molecules in the space between electrodes.
When the target material is not exposed to radiation, no current is registered in this circuit because the circuit is broken (note, there is a gap between the electrodes). But when the target material is connected to the negative terminal of a battery and exposed to radiation, a current is registered in this circuit; this current is called the photocurrent. Suppose that we now reverse the potential difference between the electrodes so that the target material now connects with the positive terminal of a battery, and then we slowly increase the voltage. The photocurrent gradually dies out and eventually stops flowing completely at some value of this reversed voltage. The potential difference at which the photocurrent stops flowing is called the stopping potential.


Figure 6.8 An experimental setup to study the photoelectric effect. The anode and cathode are enclosed in an evacuated glass tube. The voltmeter measures the electric potential difference between the electrodes, and the ammeter measures the photocurrent. The incident radiation is monochromatic.
Characteristics of the Photoelectric Effect
The photoelectric effect has three important characteristics that cannot be explained by classical physics: (1) the absence of a lag time, (2) the independence of the kinetic energy of photoelectrons from the intensity of incident radiation, and (3) the presence of a cut-off frequency. Let’s examine each of these characteristics.
The absence of lag time
When radiation strikes the target material in the electrode, electrons are emitted almost instantaneously, even at very low intensities of incident radiation. This absence of lag time contradicts our understanding based on classical physics. Classical physics predicts that for low-intensity radiation, it would take significant time before irradiated electrons could gain sufficient energy to leave the electrode surface; however, such an energy buildup is not observed.
The intensity of incident radiation and the kinetic energy of photoelectrons
Typical experimental curves are shown in Figure 6.9, in which the photocurrent is plotted versus the applied potential difference between the electrodes. For the positive potential difference, the current steadily grows until it reaches a plateau. Increasing the potential further beyond this point does not increase the photocurrent at all. A higher intensity of radiation produces a higher value of photocurrent. For the negative potential difference, as the absolute value of the potential difference increases, the value of the photocurrent decreases and becomes zero at the stopping potential. For any intensity of incident radiation, whether the intensity is high or low, the stopping potential always has the same value.
To understand why this result is unusual from the point of view of classical physics, we first have to analyze the energy of photoelectrons. A photoelectron that leaves the surface has kinetic energy K. It gained this energy from the incident electromagnetic wave. In the space between the electrodes, a photoelectron moves in the electric potential and its energy changes by the amount qΔV, where ΔV is the potential difference and q = −e. Because no forces are
present but electric force, by applying the work-energy theorem, we obtain the energy balance ΔK − eΔV = 0 for
the photoelectron, where ΔK is the change in the photoelectron’s kinetic energy. When the stopping potential $-\Delta V_{s}$
is applied, the photoelectron loses its initial kinetic energy $K_{i}$ and comes to rest. Thus, its energy balance becomes
$(0 - K_{i}) - e(-\Delta V_{s}) = 0$, so that $K_{i} = e\Delta V_{s}$. In the presence of the stopping potential, the largest kinetic energy $K_{\max}$
that a photoelectron can have is its initial kinetic energy, which it has at the surface of the photoelectrode. Therefore, the largest kinetic energy of photoelectrons can be directly measured by measuring the stopping potential:


$$K_{\max} = e\,\Delta V_{s} \tag{6.12}$$


At this point we can see where the classical theory is at odds with the experimental results. In classical theory, the photoelectron absorbs electromagnetic energy in a continuous way; this means that when the incident radiation has a high intensity, the kinetic energy in Equation 6.12 is expected to be high. Similarly, when the radiation has a low intensity, the kinetic energy is expected to be low. But the experiment shows that the maximum kinetic energy of photoelectrons is independent of the light intensity.


Figure 6.9 The detected photocurrent plotted versus the applied potential difference shows that for any intensity of incident radiation, whether the intensity is high (upper curve) or low (lower curve), the value of the stopping potential is always the same.
The presence of a cut-off frequency
For any metal surface, there is a minimum frequency of incident radiation below which photocurrent does not occur. The value of this cut-off frequency for the photoelectric effect is a physical property of the metal: Different materials have different values of cut-off frequency. Experimental data show a typical linear trend (see Figure 6.10). The kinetic energy of photoelectrons at the surface grows linearly with the increasing frequency of incident radiation. Measurements for all metal surfaces give linear plots with the same slope. None of these observed phenomena is in accord with the classical understanding of nature. According to the classical description, the kinetic energy of photoelectrons should not depend on the frequency of incident radiation at all, and there should be no cut-off frequency. Instead, in the classical picture, electrons receive energy from the incident electromagnetic wave in a continuous way, and the amount of energy they receive depends only on the intensity of the incident light and nothing else. So in the classical understanding, as long as the light is shining, the photoelectric effect is expected to continue.


Figure 6.10 Kinetic energy of photoelectrons at the surface versus the frequency of incident radiation. The photoelectric effect can only occur above the cut-off frequency $f_{c}$.
Measurements for all metal surfaces give linear plots with one slope. Each metal surface has its own cut-off frequency.


The Work Function
The photoelectric effect was explained in 1905 by A. Einstein. Einstein reasoned that if Planck’s hypothesis about energy quanta was correct for describing the energy exchange between electromagnetic radiation and cavity walls, it should also work to describe energy absorption from electromagnetic radiation by the surface of a photoelectrode. He postulated that an electromagnetic wave carries its energy in discrete packets. Einstein’s postulate goes beyond Planck’s hypothesis because it states that the light itself consists of energy quanta. In other words, it states that electromagnetic waves are quantized.
In Einstein’s approach, a beam of monochromatic light of frequency f is made of photons. A photon is a particle of light. Each photon moves at the speed of light and carries an energy quantum $E_{f}$. A photon’s energy depends only on its
frequency f. Explicitly, the energy of a photon is


$$E_{f} = hf \tag{6.13}$$


where h is Planck’s constant. In the photoelectric effect, photons arrive at the metal surface and each photon gives away
all of its energy to only one electron on the metal surface. This transfer of energy from photon to electron is of the “all or nothing” type, and there are no fractional transfers in which a photon would lose only part of its energy and survive. The essence of a quantum phenomenon is that either a photon transfers its entire energy and ceases to exist or there is no transfer at all. This is in contrast with the classical picture, where fractional energy transfers are permitted. Having this quantum understanding, the energy balance for an electron on the surface that receives the energy $E_{f}$ from a photon is


$$E_{f} = K_{\max} + \phi$$


where $K_{\max}$ is the kinetic energy, given by Equation 6.12, that an electron has at the very instant it gets detached from
the surface. In this energy balance equation, ϕ is the energy needed to detach a photoelectron from the surface. This energy
ϕ is called the work function of the metal. Each metal has its characteristic work function, as illustrated in Table 6.1. To
obtain the kinetic energy of photoelectrons at the surface, we simply invert the energy balance equation and use Equation 6.13 to express the energy of the absorbed photon. This gives us the expression for the kinetic energy of photoelectrons, which explicitly depends on the frequency of incident radiation:


$$K_{\max} = hf - \phi \tag{6.14}$$


This equation has a simple mathematical form but its physics is profound. We can now elaborate on the physical meaning behind Equation 6.14.


Table 6.1 Typical Values of the Work Function for Some Common Metals

Metal    ϕ (eV)
Na       2.46
Al       4.08
Pb       4.14
Zn       4.31
Fe       4.50
Cu       4.70
Ag       4.73
Pt       6.35

In Einstein’s interpretation, interactions take place between individual electrons and individual photons. The absence of a lag time means that these one-on-one interactions occur instantaneously. This interaction time cannot be increased by lowering the light intensity. The light intensity corresponds to the number of photons arriving at the metal surface per unit time. Even at very low light intensities, the photoelectric effect still occurs because the interaction is between one electron and one photon. As long as there is at least one photon with enough energy to transfer it to a bound electron, a photoelectron will appear on the surface of the photoelectrode.
The existence of the cut-off frequency $f_{c}$ for the photoelectric effect follows from Equation 6.14 because the kinetic
energy $K_{\max}$ of the photoelectron cannot be negative. This means that there must be some threshold frequency
for which the kinetic energy is zero, $0 = hf_{c} - \phi$. In this way, we obtain the explicit formula for cut-off frequency:


$$f_{c} = \frac{\phi}{h} \tag{6.15}$$


Cut-off frequency depends only on the work function of the metal and is in direct proportion to it. When the work function is large (when electrons are bound fast to the metal surface), the energy of the threshold photon must be large to produce a photoelectron, and then the corresponding threshold frequency is large. Photons with frequencies larger than the threshold frequency $f_{c}$ always produce photoelectrons because they have $K_{\max} > 0$. Photons with frequencies smaller than $f_{c}$
do not have enough energy to produce photoelectrons. Therefore, when incident radiation has a frequency below the cut-off frequency, the photoelectric effect is not observed. Because frequency f and wavelength λ of electromagnetic waves
are related by the fundamental relation λf = c (where c is the speed of light in vacuum), the cut-off frequency has its
corresponding cut-off wavelength $\lambda_{c}$:


$$\lambda_{c} = \frac{c}{f_{c}} = \frac{c}{\phi/h} = \frac{hc}{\phi} \tag{6.16}$$

In this equation, hc = 1240 eV · nm. Our observations can be restated in the following equivalent way: When the incident
radiation has wavelengths longer than the cut-off wavelength, the photoelectric effect does not occur.
Example 6.5


Photoelectric Effect for Silver
Radiation with wavelength 300 nm is incident on a silver surface. Will photoelectrons be observed?


Strategy
Photoelectrons can be ejected from the metal surface only when the incident radiation has a shorter wavelength than the cut-off wavelength. The work function of silver is ϕ = 4.73 eV (Table 6.1). To make the estimate, we
use Equation 6.16.
Solution
The threshold wavelength for observing the photoelectric effect in silver is


$$\lambda_{c} = \frac{hc}{\phi} = \frac{1240\ \mathrm{eV \cdot nm}}{4.73\ \mathrm{eV}} = 262\ \mathrm{nm}$$


The incident radiation has wavelength 300 nm, which is longer than the cut-off wavelength; therefore, photoelectrons are not observed.
Significance
If the photoelectrode were made of sodium instead of silver, the cut-off wavelength would be 504 nm and photoelectrons would be observed.
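The check made in Example 6.5 can be repeated for every metal in Table 6.1. The Python sketch below is an illustration with hypothetical function names; it uses the work functions from Table 6.1, Planck’s constant in eV · s from Equation 6.10, and the product hc = 1240 eV · nm quoted with Equation 6.16.

```python
H_EV = 4.136e-15   # eV*s, Planck's constant (Equation 6.10)
HC_EV_NM = 1240.0  # eV*nm, the product hc quoted with Equation 6.16

# Work functions in eV, copied from Table 6.1.
WORK_FUNCTION_EV = {
    "Na": 2.46, "Al": 4.08, "Pb": 4.14, "Zn": 4.31,
    "Fe": 4.50, "Cu": 4.70, "Ag": 4.73, "Pt": 6.35,
}


def cutoff_frequency_hz(phi_ev):
    """Cut-off frequency f_c = phi / h (Equation 6.15)."""
    return phi_ev / H_EV


def cutoff_wavelength_nm(phi_ev):
    """Cut-off wavelength lambda_c = hc / phi (Equation 6.16)."""
    return HC_EV_NM / phi_ev


def emits_photoelectrons(metal, wavelength_nm):
    """True when the incident wavelength is shorter than the metal's cut-off wavelength."""
    return wavelength_nm < cutoff_wavelength_nm(WORK_FUNCTION_EV[metal])


if __name__ == "__main__":
    incident_nm = 300.0  # the wavelength used in Example 6.5
    for metal, phi in WORK_FUNCTION_EV.items():
        print(f"{metal}: f_c = {cutoff_frequency_hz(phi):.3e} Hz, "
              f"lambda_c = {cutoff_wavelength_nm(phi):.0f} nm, "
              f"emits at {incident_nm:.0f} nm: {emits_photoelectrons(metal, incident_nm)}")
```

For 300-nm radiation the sketch reports emission for sodium, aluminum, and lead, whose cut-off wavelengths lie above 300 nm (lead only barely, at about 299.5 nm), and no emission for the other metals in the table, including silver.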


Equation 6.14 in Einstein’s model tells us that the maximum kinetic energy of photoelectrons is a linear function of the frequency of incident radiation, which is illustrated in Figure 6.10. For any metal, the slope of this plot equals Planck’s constant. The intercept with the $K_{\max}$-axis is −ϕ, so it gives us the value of the work function that is characteristic for the metal.
On the other hand, $K_{\max}$ can be directly measured in the experiment by measuring the value of the stopping potential $\Delta V_{s}$
(see Equation 6.12) at which the photocurrent stops. These direct measurements allow us to determine experimentally the value of Planck’s constant, as well as work functions of materials.
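The procedure of reading Planck’s constant and the work function off a straight-line plot can be sketched in Python. The data below are synthetic: they are generated from Equation 6.14 for sodium (ϕ = 2.46 eV, Table 6.1) at a few arbitrarily chosen frequencies, and are not measurements. The helper `linear_fit` is an ordinary least-squares fit written out by hand for illustration.

```python
H_EV = 4.136e-15  # eV*s, Planck's constant (Equation 6.10)
PHI_NA = 2.46     # eV, work function of sodium (Table 6.1)


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope*x + intercept; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# SYNTHETIC data (not measured): frequencies above sodium's cut-off and the
# stopping potentials that Equations 6.12 and 6.14 predict for them.
# A stopping potential in volts equals Kmax in electron-volts numerically.
frequencies_hz = [6.0e14, 7.0e14, 8.0e14, 9.0e14, 1.0e15]
stopping_potentials_v = [H_EV * f - PHI_NA for f in frequencies_hz]

slope, intercept = linear_fit(frequencies_hz, stopping_potentials_v)
h_fit = slope              # eV*s, slope of Kmax versus f
phi_fit = -intercept       # eV, minus the intercept on the Kmax axis
f_cutoff = phi_fit / h_fit # Hz, where the fitted line crosses Kmax = 0

if __name__ == "__main__":
    print(f"h   = {h_fit:.3e} eV*s")
    print(f"phi = {phi_fit:.2f} eV")
    print(f"f_c = {f_cutoff:.3e} Hz")
```

Because the data are generated from the model, the fit simply returns the inputs; with real stopping-potential measurements the same fit would yield experimental estimates of h and ϕ, with scatter set by the measurement errors.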
Einstein’s model also gives a straightforward explanation for the photocurrent values shown in Figure 6.9. For example, doubling the intensity of radiation translates to doubling the number of photons that strike the surface per unit time. The larger the number of photons, the larger is the number of photoelectrons, which leads to a larger photocurrent in the circuit. This is how radiation intensity affects the photocurrent. The photocurrent must reach a plateau at some value of potential difference because, in unit time, the number of photoelectrons is equal to the number of incident photons and the number of incident photons does not depend on the applied potential difference at all, but only on the intensity of incident radiation. The stopping potential does not change with the radiation intensity because the kinetic energy of photoelectrons (see Equation 6.14) does not depend on the radiation intensity.
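The link between intensity and plateau photocurrent can be made concrete with a small Python sketch. It rests on assumptions that go beyond the text: the beam power of 1 mW and the wavelength of 300 nm are hypothetical values chosen for illustration, the elementary charge is the standard value (not quoted in this section), and the default of one photoelectron per incident photon is the idealization stated above; real surfaces generally yield fewer.

```python
E_CHARGE = 1.602e-19  # C, elementary charge (standard value, not quoted in the text)
HC_EV_NM = 1240.0     # eV*nm, the product hc quoted with Equation 6.16


def photon_rate(power_w, wavelength_nm):
    """Photons per second carried by a monochromatic beam of the given power."""
    photon_energy_j = HC_EV_NM / wavelength_nm * E_CHARGE  # hc/lambda, converted eV -> J
    return power_w / photon_energy_j


def saturation_current(power_w, wavelength_nm, yield_per_photon=1.0):
    """Plateau photocurrent (A), assuming `yield_per_photon` electrons per incident photon."""
    return E_CHARGE * yield_per_photon * photon_rate(power_w, wavelength_nm)


if __name__ == "__main__":
    # HYPOTHETICAL beam: 1 mW of 300-nm radiation, then the same beam at double power.
    for power in (1.0e-3, 2.0e-3):
        print(f"P = {power:.1e} W: {photon_rate(power, 300.0):.2e} photons/s, "
              f"I_sat = {saturation_current(power, 300.0):.2e} A")
```

Doubling the power doubles both the photon rate and the plateau current, while the photon energy, and therefore the stopping potential, is unchanged.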
Example 6.6


Work Function and Cut-Off Frequency
When 180-nm light is used in an experiment with an unknown metal, the measured photocurrent drops to zero at a potential of −0.80 V. Determine the work function of the metal and its cut-off frequency for the photoelectric effect.
Strategy
To find the cut-off frequency fc, we use Equation 6.15, but first we must find the work function ϕ. To find ϕ, we use Equation 6.12 and Equation 6.14. Photocurrent drops to zero at the stopping value of potential, so we identify ΔVs = 0.80 V.
Solution
We use Equation 6.12 to find the kinetic energy of the photoelectrons:


$$K_{\max} = e\,\Delta V_s = e(0.80\ \text{V}) = 0.80\ \text{eV}.$$


Now we solve Equation 6.14 for ϕ:

$$\phi = hf - K_{\max} = \frac{hc}{\lambda} - K_{\max} = \frac{1240\ \text{eV}\cdot\text{nm}}{180\ \text{nm}} - 0.80\ \text{eV} = 6.09\ \text{eV}.$$




Finally, we use Equation 6.15 to find the cut-off frequency:

$$f_c = \frac{\phi}{h} = \frac{6.09\ \text{eV}}{4.136\times10^{-15}\ \text{eV}\cdot\text{s}} = 1.47\times10^{15}\ \text{Hz}.$$


Significance
In calculations like the one shown in this example, it is convenient to use Planck’s constant in the units of eV · s
and express all energies in eV instead of joules.
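The two steps of this example translate directly into code. This is a minimal sketch, assuming the rounded constants used in the example (hc = 1240 eV · nm and h = 4.136 × 10⁻¹⁵ eV · s); the function names are illustrative.

```python
HC_EV_NM = 1240.0   # h*c in eV·nm (rounded, as in the example)
H_EV_S = 4.136e-15  # Planck's constant in eV·s (rounded, as in the example)


def work_function_ev(wavelength_nm, stopping_potential_v):
    """phi = hc/lambda - K_max, where K_max in eV equals |ΔVs| in volts."""
    return HC_EV_NM / wavelength_nm - stopping_potential_v


def cutoff_frequency_hz(phi_ev):
    """Cut-off frequency f_c = phi / h."""
    return phi_ev / H_EV_S


phi = work_function_ev(180.0, 0.80)
f_c = cutoff_frequency_hz(phi)
print("work function (eV):", round(phi, 2))
print("cut-off frequency (Hz):", f"{f_c:.2e}")
```

Keeping every energy in eV means the stopping potential in volts can be used as the kinetic energy in eV without any conversion factor.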


Example 6.7
The Photon Energy and Kinetic Energy of Photoelectrons
A 430-nm violet light is incident on a calcium photoelectrode with a work function of 2.71 eV.
Find the energy of the incident photons and the maximum kinetic energy of ejected electrons.
Strategy
The energy of the incident photon is $E_f = hf = hc/\lambda$, where we use $f\lambda = c$. To obtain the maximum energy of the ejected electrons, we use Equation 6.16.
Solution


$$E_f = \frac{hc}{\lambda} = \frac{1240\ \text{eV}\cdot\text{nm}}{430\ \text{nm}} = 2.88\ \text{eV}, \qquad K_{\max} = E_f - \phi = 2.88\ \text{eV} - 2.71\ \text{eV} = 0.17\ \text{eV}.$$


Significance
In this experimental setup, photoelectrons stop flowing at the stopping potential of 0.17 V.


6.6 Check Your Understanding A yellow 589-nm light is incident on a surface whose work function is 1.20 eV. What is the stopping potential? What is the cut-off wavelength?


6.7 Check Your Understanding The cut-off frequency for the photoelectric effect in some material is $8.0\times10^{13}$ Hz. When the incident light has a frequency of $1.2\times10^{14}$ Hz, the stopping potential is measured as −0.16 V. Estimate a value of Planck’s constant from these data (in units of J · s and eV · s) and determine the percentage error of your estimation.


6.3 | The Compton Effect
Learning Objectives


By the end of this section, you will be able to:
• Describe Compton’s experiment
• Explain the Compton wavelength shift
• Describe how experiments with X-rays confirm the particle nature of radiation


Two of Einstein’s influential ideas introduced in 1905 were the theory of special relativity and the concept of a light quantum, which we now call a photon. Beyond 1905, Einstein went further to suggest that freely propagating electromagnetic waves consisted of photons that are particles of light in the same sense that electrons or other massive particles are particles of matter. A beam of monochromatic light of wavelength λ (or equivalently, of frequency f) can be seen either as a classical wave or as a collection of photons that travel in a vacuum with one speed, c (the speed of light), and all carrying the same energy, $E_f = hf$. This idea proved useful for explaining the interactions of light with particles of matter.
Momentum of a Photon
Unlike a particle of matter that is characterized by its rest mass m0, a photon is massless. In a vacuum, unlike a particle of matter that may vary its speed but cannot reach the speed of light, a photon travels at only one speed, which is exactly the speed of light. From the point of view of Newtonian classical mechanics, these two characteristics imply that a photon should not exist at all. For example, how can we find the linear momentum or kinetic energy of a body whose mass is zero? This apparent paradox vanishes if we describe a photon as a relativistic particle. According to the theory of special relativity, any particle in nature obeys the relativistic energy equation


$$E^2 = p^2c^2 + m_0^2c^4. \tag{6.17}$$


This relation can also be applied to a photon. In Equation 6.17, E is the total energy of a particle, p is its linear momentum, and m0 is its rest mass. For a photon, we simply set m0 = 0 in this equation. This leads to the expression for the momentum $p_f$ of a photon

$$p_f = \frac{E_f}{c}. \tag{6.18}$$

Here the photon’s energy $E_f$ is the same as that of a light quantum of frequency f, which we introduced to explain the photoelectric effect:

$$E_f = hf = \frac{hc}{\lambda}. \tag{6.19}$$


The wave relation that connects frequency f with wavelength λ and speed c also holds for photons:
$$\lambda f = c. \tag{6.20}$$

Therefore, a photon can be equivalently characterized by either its energy and wavelength, or its frequency and momentum. Equation 6.19 and Equation 6.20 can be combined into the explicit relation between a photon’s momentum and its wavelength:

$$p_f = \frac{h}{\lambda}. \tag{6.21}$$

Notice that this equation gives us only the magnitude of the photon’s momentum and contains no information about the direction in which the photon is moving. To include the direction, it is customary to write the photon’s momentum as a vector:

$$\vec{p}_f = \hbar\,\vec{k}. \tag{6.22}$$


In Equation 6.22, ℏ = h/2π is the reduced Planck’s constant (pronounced “h-bar”), which is just Planck’s constant divided by the factor 2π. Vector $\vec{k}$ is called the “wave vector” or propagation vector (the direction in which a photon is moving). The propagation vector shows the direction of the photon’s linear momentum vector. The magnitude of the wave vector is $k = |\vec{k}| = 2\pi/\lambda$ and is called the wave number. Notice that this equation does not introduce any new physics. We can verify that the magnitude of the vector in Equation 6.22 is the same as that given by Equation 6.18: ℏk = (h/2π)(2π/λ) = h/λ, which is Equation 6.21 and equals $E_f/c$ by Equation 6.19.
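The equivalence of the three expressions for the photon momentum (h/λ, E_f/c, and ℏk) can be confirmed numerically. The sketch below assumes the exact SI values of h and c and uses the 430-nm wavelength from Example 6.7 as an illustrative input; the function names are not from the text.

```python
import math

H = 6.62607015e-34   # Planck's constant in J·s (exact SI value)
C = 299792458.0      # speed of light in m/s (exact SI value)
HBAR = H / (2.0 * math.pi)


def photon_energy_j(wavelength_m):
    """Photon energy E_f = h c / lambda (Equation 6.19)."""
    return H * C / wavelength_m


def momentum_from_wavelength(wavelength_m):
    """Photon momentum p_f = h / lambda (Equation 6.21)."""
    return H / wavelength_m


def momentum_from_energy(energy_j):
    """Photon momentum p_f = E_f / c (Equation 6.18)."""
    return energy_j / C


def momentum_from_wave_number(wavelength_m):
    """Magnitude of hbar * k with k = 2 pi / lambda (Equation 6.22)."""
    return HBAR * (2.0 * math.pi / wavelength_m)


wavelength = 430e-9  # 430-nm violet light, in meters
p_lambda = momentum_from_wavelength(wavelength)
p_energy = momentum_from_energy(photon_energy_j(wavelength))
p_wave = momentum_from_wave_number(wavelength)
print(p_lambda, p_energy, p_wave)  # three routes to the same momentum, in kg·m/s
```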
The Compton Effect
The Compton effect is the term used for an unusual result observed when X-rays are scattered on some materials. By classical theory, when an electromagnetic wave is scattered off atoms, the wavelength of the scattered radiation is expected to be the same as the wavelength of the incident radiation. Contrary to this prediction of classical physics, observations show that when X-rays are scattered off some materials, such as graphite, the scattered X-rays have different wavelengths from the wavelength of the incident X-rays. This classically unexplainable phenomenon was studied experimentally by Arthur H. Compton and his collaborators, and Compton gave its explanation in 1923.
To explain the shift in wavelengths measured in the experiment, Compton used Einstein’s idea of light as a particle. The Compton effect has a very important place in the history of physics because it shows that electromagnetic radiation cannot be explained as a purely wave phenomenon. The explanation of the Compton effect gave a convincing argument to the physics community that electromagnetic waves can indeed behave like a stream of photons, which placed the concept of a photon on firm ground.
The schematics of Compton’s experimental setup are shown in Figure 6.11. The idea of the experiment is straightforward: Monochromatic X-rays with wavelength λ are incident on a sample of graphite (the “target”), where they interact with atoms inside the sample; they later emerge as scattered X-rays with wavelength λ′. A detector placed behind the target can measure the intensity of radiation scattered in any direction θ with respect to the direction of the incident X-ray beam. This scattering angle, θ, is the angle between the direction of the scattered beam and the direction of the incident beam. In this experiment, we know the intensity and the wavelength λ of the incoming (incident) beam; and for a given scattering angle θ, we measure the intensity and the wavelength λ′ of the outgoing (scattered) beam. Typical results of these measurements are shown in Figure 6.12, where the x-axis is the wavelength of the scattered X-rays and the y-axis is the intensity of the scattered X-rays, measured for different scattering angles (indicated on the graphs). For all scattering angles (except for θ = 0°), we measure two intensity peaks. One peak is located at the wavelength λ, which is the wavelength of the incident beam. The other peak is located at some other wavelength, λ′. The two peaks are separated by Δλ, which depends on the scattering angle θ of the outgoing beam (in the direction of observation). The separation Δλ is called the Compton shift.


Figure 6.11 Experimental setup for studying Compton scattering.


Figure 6.12 Experimental data show the Compton effect for X-rays scattering off graphite at various angles: The intensity of the scattered beam has two peaks. One peak appears at the wavelength λ of the incident radiation and the second peak appears at wavelength λ′. The separation Δλ between the peaks depends on the scattering angle θ, which is the angular position of the detector in Figure 6.11. The experimental data in this figure are plotted in arbitrary units so that the height of the profile reflects the intensity of the scattered beam above background noise.
Compton Shift
As given by Compton, the explanation of the Compton shift is that in the target material, graphite, valence electrons are loosely bound in the atoms and behave like free electrons. Compton assumed that the incident X-ray radiation is a stream of photons. An incoming photon in this stream collides with a valence electron in the graphite target. In the course of this collision, the incoming photon transfers some part of its energy and momentum to the target electron and leaves the scene as a scattered photon. This model explains in qualitative terms why the scattered radiation has a longer wavelength than the incident radiation. Put simply, a photon that has lost some of its energy emerges as a photon with a lower frequency, or equivalently, with a longer wavelength. To show that his model was correct, Compton used it to derive the expression for the Compton shift. In his derivation, he assumed that both photon and electron are relativistic particles and that the collision obeys two commonsense principles: (1) the conservation of linear momentum and (2) the conservation of total relativistic energy.
In the following derivation of the Compton shift, $E_f$ and $\vec{p}_f$ denote the energy and momentum, respectively, of an incident photon with frequency f. The photon collides with a relativistic electron at rest, which means that immediately before the collision, the electron’s energy is entirely its rest mass energy, $m_0c^2$. Immediately after the collision, the electron has energy E and momentum $\vec{p}$, both of which satisfy Equation 6.17. Immediately after the collision, the outgoing photon has energy $\tilde{E}_f$, momentum $\vec{\tilde{p}}_f$, and frequency f′. The direction of the incident photon is horizontal from left to right, and the direction of the outgoing photon is at the angle θ, as illustrated in Figure 6.11. The scattering angle θ is the angle between the momentum vectors $\vec{p}_f$ and $\vec{\tilde{p}}_f$, and we can write their scalar product:


$$\vec{p}_f\cdot\vec{\tilde{p}}_f = p_f\,\tilde{p}_f\cos\theta. \tag{6.23}$$

Following Compton’s argument, we assume that the colliding photon and electron form an isolated system. This assumption is valid for weakly bound electrons that, to a good approximation, can be treated as free particles. Our first equation is the conservation of energy for the photon-electron system:

$$E_f + m_0c^2 = \tilde{E}_f + E. \tag{6.24}$$

The left side of this equation is the energy of the system at the instant immediately before the collision, and the right side of the equation is the energy of the system at the instant immediately after the collision. Our second equation is the conservation of linear momentum for the photon–electron system where the electron is at rest at the instant immediately before the collision:

$$\vec{p}_f = \vec{\tilde{p}}_f + \vec{p}. \tag{6.25}$$

The left side of this equation is the momentum of the system right before the collision, and the right side of the equation is the momentum of the system right after collision. The entire physics of Compton scattering is contained in these three preceding equations; the remaining part is algebra. At this point, we could jump to the concluding formula for the Compton shift, but it is beneficial to highlight the main algebraic steps that lead to Compton’s formula, which we give here as follows.
We start by rearranging the terms in Equation 6.24 and squaring it:

$$\left[\left(E_f - \tilde{E}_f\right) + m_0c^2\right]^2 = E^2.$$


In the next step, we substitute Equation 6.17 for $E^2$, simplify, and divide both sides by $c^2$ to obtain

$$\left(E_f/c - \tilde{E}_f/c\right)^2 + 2m_0c\left(E_f/c - \tilde{E}_f/c\right) = p^2.$$


Now we can use Equation 6.18 to express this form of the energy equation in terms of momenta. The result is

$$\left(p_f - \tilde{p}_f\right)^2 + 2m_0c\left(p_f - \tilde{p}_f\right) = p^2. \tag{6.26}$$


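To eliminate $p^2$, we turn to the momentum equation, Equation 6.25, rearrange its terms, and square it to obtain

$$\left(\vec{p}_f - \vec{\tilde{p}}_f\right)^2 = p^2 \quad\text{and}\quad \left(\vec{p}_f - \vec{\tilde{p}}_f\right)^2 = p_f^2 + \tilde{p}_f^2 - 2\,\vec{p}_f\cdot\vec{\tilde{p}}_f.$$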


The product of the momentum vectors is given by Equation 6.23. When we substitute this result for $p^2$ in Equation 6.26, we obtain the energy equation that contains the scattering angle θ:

$$\left(p_f - \tilde{p}_f\right)^2 + 2m_0c\left(p_f - \tilde{p}_f\right) = p_f^2 + \tilde{p}_f^2 - 2p_f\,\tilde{p}_f\cos\theta.$$


With further algebra, this result can be simplified to

$$\frac{1}{\tilde{p}_f} - \frac{1}{p_f} = \frac{1}{m_0c}(1 - \cos\theta). \tag{6.27}$$
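The “further algebra” can be spelled out in a few lines. The following is a sketch of one route from the energy equation containing the scattering angle to Equation 6.27; it uses only the symbols already defined in this derivation.

```latex
\begin{align*}
\left(p_f - \tilde{p}_f\right)^2 + 2m_0c\left(p_f - \tilde{p}_f\right)
  &= p_f^2 + \tilde{p}_f^2 - 2p_f\,\tilde{p}_f\cos\theta \\
p_f^2 - 2p_f\,\tilde{p}_f + \tilde{p}_f^2 + 2m_0c\left(p_f - \tilde{p}_f\right)
  &= p_f^2 + \tilde{p}_f^2 - 2p_f\,\tilde{p}_f\cos\theta \\
2m_0c\left(p_f - \tilde{p}_f\right)
  &= 2p_f\,\tilde{p}_f\,(1 - \cos\theta) \\
\frac{p_f - \tilde{p}_f}{p_f\,\tilde{p}_f}
  &= \frac{1}{m_0c}(1 - \cos\theta) \\
\frac{1}{\tilde{p}_f} - \frac{1}{p_f}
  &= \frac{1}{m_0c}(1 - \cos\theta)
\end{align*}
```

The squared momenta cancel in the second line, which is why the final relation is linear in the inverse momenta and therefore linear in the wavelengths.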


Now recall Equation 6.21 and write $1/\tilde{p}_f = \lambda'/h$ and $1/p_f = \lambda/h$. When these relations are substituted into Equation 6.27, we obtain the relation for the Compton shift:

$$\lambda' - \lambda = \frac{h}{m_0c}(1 - \cos\theta). \tag{6.28}$$

The factor $h/m_0c$ is called the Compton wavelength of the electron:

$$\lambda_c = \frac{h}{m_0c} = 0.00243\ \text{nm} = 2.43\ \text{pm}. \tag{6.29}$$

Denoting the shift as Δλ = λ′ − λ, the concluding result can be rewritten as

$$\Delta\lambda = \lambda_c(1 - \cos\theta). \tag{6.30}$$


This formula for the Compton shift describes outstandingly well the experimental results shown in Figure 6.12. Scattering data measured for molybdenum, graphite, calcite, and many other target materials are in accord with this theoretical result. The nonshifted peak shown in Figure 6.12 is due to photon collisions with tightly bound inner electrons in the target material. Photons that collide with the inner electrons of the target atoms in fact collide with the entire atom. In this extreme case, the rest mass in Equation 6.29 must be changed to the rest mass of the atom. This type of shift is four orders of magnitude smaller than the shift caused by collisions with electrons and is so small that it can be neglected.
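The “four orders of magnitude” estimate can be checked by evaluating h/(mc) for the electron and for a whole atom. The sketch below is an illustration under stated assumptions: it takes a carbon-12 atom (mass 12 u) as the graphite scatterer and uses CODATA values for the constants; the function name is illustrative.

```python
H = 6.62607015e-34          # Planck's constant in J·s (exact SI value)
C = 299792458.0             # speed of light in m/s (exact SI value)
M_ELECTRON = 9.1093837e-31  # electron rest mass in kg (CODATA, rounded)
ATOMIC_MASS_UNIT = 1.66053907e-27  # unified atomic mass unit in kg (CODATA, rounded)


def compton_wavelength_m(rest_mass_kg):
    """Compton wavelength h / (m0 c) for a scatterer of the given rest mass."""
    return H / (rest_mass_kg * C)


# Electron: the value that appears in Equation 6.29.
lambda_c_electron = compton_wavelength_m(M_ELECTRON)

# Assumption for illustration: the whole scatterer is a carbon-12 atom (12 u).
m_carbon = 12.0 * ATOMIC_MASS_UNIT
lambda_c_carbon = compton_wavelength_m(m_carbon)

ratio = lambda_c_electron / lambda_c_carbon  # equals m_carbon / M_ELECTRON
print("electron Compton wavelength (pm):", lambda_c_electron * 1e12)
print("carbon-atom Compton wavelength (pm):", lambda_c_carbon * 1e12)
print("ratio of the two shifts at any angle:", ratio)
```

Since the angular factor (1 − cos θ) is the same in both cases, the ratio of the shifts is just the ratio of the masses.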




Compton scattering is an example of inelastic scattering, in which the scattered radiation has a longer wavelength than the wavelength of the incident radiation. In today’s usage, the term “Compton scattering” is used for the inelastic scattering of photons by free, charged particles. In Compton scattering, treating photons as particles with momenta that can be transferred to charged particles provides the theoretical background to explain the wavelength shifts measured in experiments; this is the evidence that radiation consists of photons.
Example 6.8


Compton Scattering
A 71-pm X-ray is incident on a calcite target. Find the wavelength of the X-ray scattered at a 30° angle. What is the largest shift that can be expected in this experiment?
Strategy
To find the wavelength of the scattered X-ray, first we must find the Compton shift for the given scattering angle,
θ = 30°. We use Equation 6.30. Then we add this shift to the incident wavelength to obtain the scattered
wavelength. The largest Compton shift occurs at the angle θ when 1 − cosθ has the largest value, which is for
the angle θ = 180°.
Solution
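The shift at θ = 30° is

$$\Delta\lambda = \lambda_c(1 - \cos 30^\circ) = 0.134\,\lambda_c = (0.134)(2.43)\ \text{pm} = 0.325\ \text{pm}.$$

This gives the scattered wavelength:

$$\lambda' = \lambda + \Delta\lambda = (71 + 0.325)\ \text{pm} = 71.325\ \text{pm}.$$

The largest shift is

$$(\Delta\lambda)_{\max} = \lambda_c(1 - \cos 180^\circ) = 2(2.43\ \text{pm}) = 4.86\ \text{pm}.$$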


Significance
The largest shift in wavelength is detected for the backscattered radiation; however, most of the photons from the incident beam pass through the target and only a small fraction of photons gets backscattered (typically, less than 5%). Therefore, these measurements require highly sensitive detectors.
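Equation 6.30 is simple enough to tabulate for any scattering angle. The following is a minimal sketch, assuming the rounded Compton wavelength of 2.43 pm from Equation 6.29; the function names are illustrative.

```python
import math

LAMBDA_C_PM = 2.43  # Compton wavelength of the electron in pm (Equation 6.29, rounded)


def compton_shift_pm(theta_deg):
    """Compton shift in pm for a scattering angle in degrees (Equation 6.30)."""
    return LAMBDA_C_PM * (1.0 - math.cos(math.radians(theta_deg)))


def scattered_wavelength_pm(incident_pm, theta_deg):
    """Wavelength of the scattered X-ray: incident wavelength plus the shift."""
    return incident_pm + compton_shift_pm(theta_deg)


incident = 71.0  # incident wavelength in pm, as in the example
shift_30 = compton_shift_pm(30.0)
shift_max = compton_shift_pm(180.0)
print("shift at 30 degrees (pm):", round(shift_30, 2))
print("scattered wavelength at 30 degrees (pm):", round(scattered_wavelength_pm(incident, 30.0), 2))
print("largest shift, at 180 degrees (pm):", round(shift_max, 2))
```

Note that the shift does not depend on the incident wavelength, so the same function applies to any X-ray source.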


6.8 Check Your Understanding A 71-pm X-ray is incident on a calcite target. Find the wavelength of the X-ray scattered at a 60° angle. What is the smallest shift that can be expected in this experiment?


6.4 | Bohr’s Model of the Hydrogen Atom
Learning Objectives


By the end of this section, you will be able to:
• Explain the difference between the absorption spectrum and the emission spectrum of radiation emitted by atoms
• Describe the Rutherford gold foil experiment and the discovery of the atomic nucleus
• Explain the atomic structure of hydrogen
• Describe the postulates of the early quantum theory for the hydrogen atom
• Summarize how Bohr’s quantum model of the hydrogen atom explains the radiation spectrum of atomic hydrogen


Historically, Bohr’s model of the hydrogen atom is the very first model of atomic structure that correctly explained the radiation spectra of atomic hydrogen. The model has a special place in the history of physics because it introduced an early quantum theory, which brought about new developments in scientific thought and later culminated in the development of quantum mechanics. To understand the specifics of Bohr’s model, we must first review the nineteenth-century discoveries that prompted its formulation.
When we use a prism to analyze white light coming from the sun, several dark lines in the solar spectrum are observed (Figure 6.13). Solar absorption lines are called Fraunhofer lines after Joseph von Fraunhofer, who accurately measured their wavelengths. During 1854–1861, Gustav Kirchhoff and Robert Bunsen discovered that for the various chemical elements, the line emission spectrum of an element exactly matches its line absorption spectrum. The difference between the absorption spectrum and the emission spectrum is explained in Figure 6.14. An absorption spectrum is observed when light passes through a gas. This spectrum appears as black lines that occur only at certain wavelengths on the background of the continuous spectrum of white light (Figure 6.13). The missing wavelengths tell us which wavelengths of the radiation are absorbed by the gas. The emission spectrum is observed when light is emitted by a gas. This spectrum is seen as colorful lines on the black background (see Figure 6.15 and Figure 6.16). Positions of the emission lines tell us which wavelengths of the radiation are emitted by the gas. Each chemical element has its own characteristic emission spectrum. For each element, the positions of its emission lines are exactly the same as the positions of its absorption lines. This means that atoms of a specific element absorb radiation only at specific wavelengths and radiation that does not have these wavelengths is not absorbed by the element at all. This also means that the radiation emitted by atoms of each element has exactly the same wavelengths as the radiation they absorb.


Figure 6.13 In the solar emission spectrum in the visible range from 380 nm to 710 nm, Fraunhofer lines are observed as vertical black lines at specific spectral positions in the continuous spectrum. Highly sensitive modern instruments observe thousands of such lines.


Figure 6.14 Observation of line spectra: (a) setup to observe absorption lines; (b) setup to observe emission lines. (a) White light passes through a cold gas that is contained in a glass flask. A prism is used to separate wavelengths of the passed light. In the spectrum of the passed light, some wavelengths are missing, which are seen as black absorption lines in the continuous spectrum on the viewing screen. (b) A gas is contained in a glass discharge tube that has electrodes at its ends. At a high potential difference between the electrodes, the gas glows and the light emitted from the gas passes through the prism that separates its wavelengths. In the spectrum of the emitted light, only specific wavelengths are present, which are seen as colorful emission lines on the screen.


Figure 6.15 The emission spectrum of atomic hydrogen: The spectral positions of emission lines are characteristic for hydrogen atoms. (credit: “Merikanto”/Wikimedia Commons)


Figure 6.16 The emission spectrum of atomic iron: The spectral positions of emission lines are characteristic for iron atoms.


Emission spectra of the elements have complex structures; they become even more complex for elements with higher atomic numbers. The simplest spectrum, shown in Figure 6.15, belongs to the hydrogen atom. Only four lines are visible to the human eye. As you read from right to left in Figure 6.15, these lines are: red (656 nm), called the H-α line; aqua (486 nm); blue (434 nm); and violet (410 nm). The lines with wavelengths shorter than 400 nm appear in the ultraviolet part of the spectrum (Figure 6.15, far left) and are invisible to the human eye. There are infinitely many invisible spectral lines in the series for hydrogen.
An empirical formula to describe the positions (wavelengths) λ of the hydrogen emission lines in this series was discovered
in 1885 by Johann Balmer. It is known as the Balmer formula:


$$\frac{1}{\lambda} = R_H\left(\frac{1}{2^2} - \frac{1}{n^2}\right). \tag{6.31}$$


The constant $R_H = 1.09737\times10^{7}\ \text{m}^{-1}$ is called the Rydberg constant for hydrogen. In Equation 6.31, the positive integer n takes on values n = 3, 4, 5, 6 for the four visible lines in this series. The series of emission lines given by the Balmer formula is called the Balmer series for hydrogen. Other emission lines of hydrogen that were discovered in the twentieth century are described by the Rydberg formula, which summarizes all of the experimental data:


$$\frac{1}{\lambda} = R_H\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right), \quad\text{where } n_i = n_f + 1,\ n_f + 2,\ n_f + 3,\ \ldots \tag{6.32}$$


When $n_f = 1$, the series of spectral lines is called the Lyman series. When $n_f = 2$, the series is called the Balmer series, and in this case, the Rydberg formula coincides with the Balmer formula. When $n_f = 3$, the series is called the Paschen series. When $n_f = 4$, the series is called the Brackett series. When $n_f = 5$, the series is called the Pfund series. When $n_f = 6$, we have the Humphreys series. As you may guess, there are infinitely many such spectral bands in the spectrum of hydrogen because $n_f$ can be any positive integer number.
The Rydberg formula for hydrogen gives the exact positions of the spectral lines as they are observed in a laboratory; however, at the beginning of the twentieth century, nobody could explain why it worked so well. The Rydberg formula remained unexplained until the first successful model of the hydrogen atom was proposed in 1913.




Example 6.9
Limits of the Balmer Series
Calculate the longest and the shortest wavelengths in the Balmer series.
Strategy
We can use either the Balmer formula or the Rydberg formula. The longest wavelength is obtained when $1/n_i$ is largest, which is when $n_i = n_f + 1 = 3$, because $n_f = 2$ for the Balmer series. The smallest wavelength is obtained when $1/n_i$ is smallest, which is $1/n_i \to 0$ when $n_i \to \infty$.
Solution
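The long-wave limit:

$$\frac{1}{\lambda} = R_H\left(\frac{1}{2^2} - \frac{1}{3^2}\right) = (1.09737\times10^{7})\,\frac{1}{\text{m}}\left(\frac{1}{4} - \frac{1}{9}\right) \Rightarrow \lambda = 656.3\ \text{nm}$$

The short-wave limit:

$$\frac{1}{\lambda} = R_H\left(\frac{1}{2^2} - 0\right) = (1.09737\times10^{7})\,\frac{1}{\text{m}}\left(\frac{1}{4}\right) \Rightarrow \lambda = 364.6\ \text{nm}$$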


Significance
Note that there are infinitely many spectral lines lying between these two limits.
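The Rydberg formula can be evaluated for any pair of integers, which makes it easy to list the lines of a series and its limits. The sketch below assumes the rounded constant R_H = 1.09737 × 10⁷ m⁻¹ quoted in the text; with this constant the computed Balmer limits come out near 656.1 nm and 364.5 nm, within about 0.03% of the values quoted in Example 6.9. The function name is illustrative.

```python
import math

R_H = 1.09737e7  # Rydberg constant for hydrogen in 1/m, as quoted in the text


def rydberg_wavelength_nm(n_f, n_i):
    """Wavelength in nm from Equation 6.32; n_i may be math.inf for the series limit."""
    if n_i <= n_f:
        raise ValueError("n_i must be larger than n_f")
    inverse_ni_squared = 0.0 if math.isinf(n_i) else 1.0 / n_i**2
    inverse_wavelength = R_H * (1.0 / n_f**2 - inverse_ni_squared)  # in 1/m
    return 1e9 / inverse_wavelength


# The four visible Balmer lines (n_f = 2, n_i = 3, 4, 5, 6).
visible_balmer = [rydberg_wavelength_nm(2, n_i) for n_i in (3, 4, 5, 6)]
print("visible Balmer lines (nm):", [round(w) for w in visible_balmer])

# Limits of the Balmer series.
long_limit = rydberg_wavelength_nm(2, 3)
short_limit = rydberg_wavelength_nm(2, math.inf)
print("long-wave limit (nm):", round(long_limit, 1))
print("short-wave limit (nm):", round(short_limit, 1))
```

Changing the first argument selects another series, for example `rydberg_wavelength_nm(3, 4)` for the first Paschen line.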


6.9 Check Your Understanding What are the limits of the Lyman series? Can you see these spectral lines?


The key to unlocking the mystery of atomic spectra is in understanding atomic structure. Scientists have long known that matter is made of atoms. According to nineteenth-century science, atoms are the smallest indivisible quantities of matter. This scientific belief was shattered by a series of groundbreaking experiments that proved the existence of subatomic particles, such as electrons, protons, and neutrons.
The electron was discovered and identified as the smallest quantity of electric charge by J.J. Thomson in 1897 in his cathode ray experiments, also known as β-ray experiments: A β-ray is a beam of electrons. In 1904, Thomson proposed the first model of atomic structure, known as the “plum pudding” model, in which an atom consisted of an unknown positively charged matter with negative electrons embedded in it like plums in a pudding. Around 1900, E. Rutherford, and independently, Paul Ulrich Villard, classified all radiation known at that time as α-rays, β-rays, and γ-rays (a γ-ray is a beam of highly energetic photons). In 1907, Rutherford and Thomas Royds used spectroscopy methods to show that positively charged particles of α-radiation (called α-particles) are in fact doubly ionized atoms of helium. In 1909, Rutherford, Ernest Marsden, and Hans Geiger used α-particles in their famous scattering experiment that disproved Thomson’s model (see Linear Momentum and Collisions (http://cnx.org/content/m58317/latest/)).
In the Rutherford gold foil experiment (also known as the Geiger–Marsden experiment), α-particles were incident on a thin gold foil and were scattered by gold atoms inside the foil (see Types of Collisions (http://cnx.org/content/m58321/latest/#CNX_UPhysics_09_04_TvsR)). The outgoing particles were detected by a 360° scintillation screen surrounding the gold target (for a detailed description of the experimental setup, see Linear Momentum and Collisions (http://cnx.org/content/m58317/latest/)). When a scattered particle struck the screen, a tiny flash of light (scintillation) was observed at that location. By counting the scintillations seen at various angles with respect to the direction of the incident beam, the scientists could determine what fraction of the incident particles were scattered and what fraction were not deflected at all. If the plum pudding model were correct, there would be no back-scattered α-particles. However, the results of the Rutherford experiment showed that, although a sizable fraction of α-particles emerged from the foil not scattered at all as though the foil were not in their way, a significant fraction of α-particles were back-scattered toward the source. This kind of result was possible only when most of the mass and the entire positive charge of the gold atom were concentrated in a tiny space inside the atom.
In 1911, Rutherford proposed a nuclear model of the atom. In Rutherford’s model, an atom contained a positively charged nucleus of negligible size, almost like a point, but included almost the entire mass of the atom. The atom also contained negative electrons that were located within the atom but relatively far away from the nucleus. Ten years later, Rutherford coined the name proton for the nucleus of hydrogen and the name neutron for a hypothetical electrically neutral particle that would mediate the binding of positive protons in the nucleus (the neutron was discovered in 1932 by James Chadwick). Rutherford is credited with the discovery of the atomic nucleus; however, the Rutherford model of atomic structure does not explain the Rydberg formula for the hydrogen emission lines.
Bohr’s model of the hydrogen atom, proposed by Niels Bohr in 1913, was the first quantum model that correctly explained the hydrogen emission spectrum. Bohr’s model combines the classical mechanics of planetary motion with the quantum concept of photons. Once Rutherford had established the existence of the atomic nucleus, Bohr’s intuition that the negative electron in the hydrogen atom must revolve around the positive nucleus became a logical consequence of the inverse-square-distance law of electrostatic attraction. Recall that Coulomb’s law describing the attraction between two opposite charges has a similar form to Newton’s universal law of gravitation in the sense that the gravitational force and the electrostatic force are both decreasing as $1/r^2$, where r is the separation distance between the bodies. In the same way as Earth revolves around the sun, the negative electron in the hydrogen atom can revolve around the positive nucleus. However, an accelerating charge radiates its energy. Classically, if the electron moved around the nucleus in a planetary fashion, it would be undergoing centripetal acceleration, and thus would be radiating energy that would cause it to spiral down into the nucleus. Such a planetary hydrogen atom would not be stable, which is contrary to what we know about ordinary hydrogen atoms that do not disintegrate. Moreover, the classical motion of the electron is not able to explain the discrete emission spectrum of hydrogen.
To circumvent these two difficulties, Bohr proposed the following three postulates of Bohr’s model:


1. The negative electron moves around the positive nucleus (proton) in a circular orbit. All electron orbits are centered at the nucleus. Not all classically possible orbits are available to an electron bound to the nucleus.
2. The allowed electron orbits satisfy the first quantization condition: In the nth orbit, the angular momentum $L_n$ of the electron can take only discrete values:

$$L_n = n\hbar, \quad\text{where } n = 1, 2, 3, \ldots \tag{6.33}$$

This postulate says that the electron’s angular momentum is quantized. With $r_n$ and $v_n$ denoting, respectively, the radius of the nth orbit and the electron’s speed in it, the first quantization condition can be expressed explicitly as

$$m_e v_n r_n = n\hbar. \tag{6.34}$$

3. An electron is allowed to make transitions from one orbit where its energy is $E_n$ to another orbit where its energy is $E_m$. When an atom absorbs a photon, the electron makes a transition to a higher-energy orbit. When an atom emits a photon, the electron transits to a lower-energy orbit. Electron transitions with the simultaneous photon absorption or photon emission take place instantaneously. The allowed electron transitions satisfy the second quantization condition:

$$hf = |E_n - E_m| \tag{6.35}$$

where hf is the energy of either an emitted or an absorbed photon with frequency f. The second quantization condition states that an electron’s change in energy in the hydrogen atom is quantized.


These three postulates of the early quantum theory of the hydrogen atom allow us to derive not only the Rydberg formula, but also the value of the Rydberg constant and other important properties of the hydrogen atom such as its energy levels, its ionization energy, and the sizes of electron orbits. Note that in Bohr’s model, along with two nonclassical quantization postulates, we also have the classical description of the electron as a particle that is subjected to the Coulomb force, and its motion must obey Newton’s laws of motion. The hydrogen atom, as an isolated system, must obey the laws of conservation of energy and momentum in the way we know from classical physics. Having this theoretical framework in mind, we are ready to proceed with our analysis.




Electron Orbits
To obtain the size $r_n$ of the electron’s nth orbit and the electron’s speed $v_n$ in it, we turn to Newtonian mechanics. As a charged particle, the electron experiences an electrostatic pull toward the positively charged nucleus in the center of its circular orbit. This electrostatic pull is the centripetal force that causes the electron to move in a circle around the nucleus. Therefore, the magnitude of centripetal force is identified with the magnitude of the electrostatic force:

$$\frac{m_e v_n^2}{r_n} = \frac{1}{4\pi\varepsilon_0}\frac{e^2}{r_n^2}. \tag{6.36}$$


Here, e denotes the value of the elementary charge. The negative electron and positive proton have the same value of
charge, |q| = e. When Equation 6.36 is combined with the first quantization condition given by Equation 6.34, we can
solve for the speed, vn, and for the radius, rn :


(6.37) $v_n = \frac{1}{4\pi\varepsilon_0}\frac{e^2}{\hbar}\frac{1}{n}$

(6.38) $r_n = 4\pi\varepsilon_0\frac{\hbar^2}{m_e e^2}\,n^2.$


Note that these results tell us that the electron’s speed as well as the radius of its orbit depend only on the index n that enumerates the orbit because all other quantities in the preceding equations are fundamental constants. We see from Equation 6.38 that the size of the orbit grows as the square of n. This means that the second orbit is four times as large as the first orbit, and the third orbit is nine times as large as the first orbit, and so on. We also see from Equation 6.37 that the electron’s speed in the orbit decreases as the orbit size increases. The electron’s speed is largest in the first Bohr orbit, for n = 1, which is the orbit closest to the nucleus. The radius of the first Bohr orbit is called the Bohr radius of hydrogen, denoted as $a_0$. Its value is obtained by setting n = 1 in Equation 6.38:


(6.39) $a_0 = 4\pi\varepsilon_0\frac{\hbar^2}{m_e e^2} = 5.29\times10^{-11}\ \text{m} = 0.529\ \text{Å}.$


We can substitute a0 in Equation 6.38 to express the radius of the nth orbit in terms of a0 :


(6.40) $r_n = a_0 n^2.$


This result means that the electron orbits in the hydrogen atom are quantized because the orbital radius takes on only the specific values $a_0, 4a_0, 9a_0, 16a_0, \ldots$ given by Equation 6.40, and no other values are allowed.
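The orbit formulas can be checked numerically. The following is a minimal sketch, assuming rounded CODATA-style values for the constants; the function names are illustrative and not part of the text. It evaluates Equation 6.37 through Equation 6.40 and confirms that the product $m_e v_n r_n$ equals $n\hbar$, as the first quantization condition requires.

```python
import math

# Physical constants in SI units (rounded CODATA-style values, assumed here)
HBAR = 1.054571817e-34       # reduced Planck constant, J*s
M_E = 9.1093837015e-31       # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m


def bohr_radius():
    """Equation 6.39: a0 = 4*pi*eps0*hbar^2 / (m_e * e^2), in meters."""
    return 4 * math.pi * EPS0 * HBAR**2 / (M_E * E_CHARGE**2)


def orbit_radius(n):
    """Equation 6.40: r_n = a0 * n^2, in meters."""
    return bohr_radius() * n**2


def orbit_speed(n):
    """Equation 6.37: v_n = e^2 / (4*pi*eps0*hbar*n), in m/s."""
    return E_CHARGE**2 / (4 * math.pi * EPS0 * HBAR * n)


def angular_momentum(n):
    """Orbital angular momentum m_e * v_n * r_n, in J*s."""
    return M_E * orbit_speed(n) * orbit_radius(n)


print("a0 =", bohr_radius(), "m")
print("v1 =", orbit_speed(1), "m/s")
print("L1 / hbar =", angular_momentum(1) / HBAR)
```

Because the speed falls as 1/n while the radius grows as n squared, the angular momentum grows linearly with n.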
Electron Energies
The total energy En of an electron in the nth orbit is the sum of its kinetic energy Kn and its electrostatic potential energy
Un. Utilizing Equation 6.37, we find that


(6.41) $K_n = \frac{1}{2}m_e v_n^2 = \frac{1}{32\pi^2\varepsilon_0^2}\frac{m_e e^4}{\hbar^2}\frac{1}{n^2}.$


Recall that the electrostatic potential energy of interaction between two charges q1 and q2 that are separated by a distance
r12 is (1 /4πε0)q1q2 /r12. Here, q1 = + e is the charge of the nucleus in the hydrogen atom (the charge of the proton),
q2 = −e is the charge of the electron and r12 = rn is the radius of the nth orbit. Now we use Equation 6.38 to find the
potential energy of the electron:




This OpenStax book is available for free at http://cnx.org/content/col12067/1.4




(6.42) $U_n = -\frac{1}{4\pi\varepsilon_0}\frac{e^2}{r_n} = -\frac{1}{16\pi^2\varepsilon_0^2}\frac{m_e e^4}{\hbar^2}\frac{1}{n^2}.$


The total energy of the electron is the sum of Equation 6.41 and Equation 6.42:
(6.43) $E_n = K_n + U_n = -\frac{1}{32\pi^2\varepsilon_0^2}\frac{m_e e^4}{\hbar^2}\frac{1}{n^2}.$


Note that the energy depends only on the index n because the remaining symbols in Equation 6.43 are physical constants. The value of the constant factor in Equation 6.43 is

(6.44) $E_0 = \frac{1}{32\pi^2\varepsilon_0^2}\frac{m_e e^4}{\hbar^2} = \frac{1}{8\varepsilon_0^2}\frac{m_e e^4}{h^2} = 2.17\times10^{-18}\ \text{J} = 13.6\ \text{eV}.$


It is convenient to express the electron’s energy in the nth orbit in terms of this energy, as


(6.45) $E_n = -E_0\frac{1}{n^2}.$


Now we can see that the electron energies in the hydrogen atom are quantized because they can have only discrete values of $-E_0, -E_0/4, -E_0/9, -E_0/16, \ldots$ given by Equation 6.45, and no other energy values are allowed. This set of allowed electron energies is called the energy spectrum of hydrogen (Figure 6.17). The index n that enumerates energy levels in Bohr’s model is called the energy quantum number. We identify the energy of the electron inside the hydrogen atom with the energy of the hydrogen atom. Note that the smallest value of energy is obtained for n = 1, so the hydrogen atom cannot have energy smaller than that. This smallest value of the electron energy in the hydrogen atom is called the ground state energy of the hydrogen atom and its value is


(6.46) $E_1 = -E_0 = -13.6\ \text{eV}.$


The hydrogen atom may have other energies that are higher than the ground state. These higher energy states are known as excited energy states of a hydrogen atom.
There is only one ground state, but there are infinitely many excited states because there are infinitely many values of n in Equation 6.45. We say that the electron is in the “first excited state” when its energy is $E_2$ (when n = 2), the second excited state when its energy is $E_3$ (when n = 3) and, in general, in the nth excited state when its energy is $E_{n+1}$. There is no highest-of-all excited state; however, there is a limit to the sequence of excited states. If we keep increasing n in Equation 6.45, we find that the limit is $-\lim_{n\to\infty} E_0/n^2 = 0$. In this limit, the electron is no longer bound to the nucleus but becomes a free electron. An electron remains bound in the hydrogen atom as long as its energy is negative. An electron that orbits the nucleus in the first Bohr orbit, closest to the nucleus, is in the ground state, where its energy has the smallest value. In the ground state, the electron is most strongly bound to the nucleus and its energy is given by Equation 6.46. If we want to remove this electron from the atom, we must supply it with enough energy, $E_\infty$, to at least balance out its ground state energy $E_1$:


(6.47) $E_\infty + E_1 = 0 \Rightarrow E_\infty = -E_1 = -(-E_0) = E_0 = 13.6\ \text{eV}.$
The energy that is needed to remove the electron from the atom is called the ionization energy. The ionization energy $E_\infty$ that is needed to remove the electron from the first Bohr orbit is called the ionization limit of the hydrogen atom. The ionization limit in Equation 6.47 that we obtain in Bohr’s model agrees with the experimental value.
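As a numerical check on Equation 6.44 through Equation 6.47, the sketch below evaluates $E_0$ from fundamental constants and lists the energy levels. The constant values are rounded CODATA-style numbers and the function names are illustrative assumptions, not part of the text.

```python
# Physical constants in SI units (rounded CODATA-style values, assumed here)
H = 6.62607015e-34           # Planck constant, J*s
M_E = 9.1093837015e-31       # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C (also J per eV)
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m


def e0_joules():
    """Equation 6.44: E0 = m_e e^4 / (8 eps0^2 h^2), in joules."""
    return M_E * E_CHARGE**4 / (8 * EPS0**2 * H**2)


def energy_level_ev(n):
    """Equation 6.45: E_n = -E0 / n^2, in electron volts."""
    return -(e0_joules() / E_CHARGE) / n**2


def ionization_energy_ev(n):
    """Energy needed to free an electron from level n (brings E_n up to zero)."""
    return -energy_level_ev(n)


for n in range(1, 5):
    print(n, energy_level_ev(n), "eV")
print("ionization limit:", ionization_energy_ev(1), "eV")
```

The levels crowd together toward zero as n grows, which is the limit of the sequence of excited states described above.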






Figure 6.17 The energy spectrum of the hydrogen atom. Energy levels (horizontal lines) represent the bound states of an electron in the atom. There is only one ground state, n = 1, and infinitely many quantized excited states. The states are enumerated by the quantum number n = 1, 2, 3, 4, .... Vertical lines illustrate the allowed electron transitions between the states. Downward arrows illustrate transitions with an emission of a photon with a wavelength in the indicated spectral band.


Spectral Emission Lines of Hydrogen
To obtain the wavelengths of the emitted radiation when an electron makes a transition from the nth orbit to the mth orbit, we use the second of Bohr’s quantization conditions and Equation 6.45 for energies. The emission of energy from the atom can occur only when an electron makes a transition from an excited state to a lower-energy state. In the course of such a transition, the emitted photon carries away the difference of energies between the states involved in the transition. The transition cannot go in the other direction because the energy of a photon cannot be negative, which means that for emission we must have $E_n > E_m$ and $n > m$. Therefore, the third of Bohr’s postulates gives

(6.48) $hf = |E_n - E_m| = E_n - E_m = -E_0\frac{1}{n^2} + E_0\frac{1}{m^2} = E_0\left(\frac{1}{m^2} - \frac{1}{n^2}\right).$


Now we express the photon’s energy in terms of its wavelength, h f = hc /λ, and divide both sides of Equation 6.48 by
hc. The result is






(6.49) $\frac{1}{\lambda} = \frac{E_0}{hc}\left(\frac{1}{m^2} - \frac{1}{n^2}\right).$


The value of the constant in this equation is
(6.50) $\frac{E_0}{hc} = \frac{13.6\ \text{eV}}{(4.136\times10^{-15}\ \text{eV}\cdot\text{s})(2.998\times10^{8}\ \text{m/s})} = 1.097\times10^{7}\ \frac{1}{\text{m}}.$


This value is exactly the Rydberg constant $R_H$ in the Rydberg heuristic formula Equation 6.32. In fact, Equation 6.49 is identical to the Rydberg formula, because for a given m, we have n = m + 1, m + 2, .... In this way, the Bohr quantum model of the hydrogen atom allows us to derive the experimental Rydberg constant from first principles and to express it in terms of fundamental constants. Transitions between the allowed electron orbits are illustrated in Figure 6.17.
We can repeat the same steps that led to Equation 6.49 to obtain the wavelength of the absorbed radiation; this again gives Equation 6.49 but this time for the positions of absorption lines in the absorption spectrum of hydrogen. The only difference is that for absorption, the quantum number m is the index of the orbit occupied by the electron before the transition (lower-energy orbit) and the quantum number n is the index of the orbit to which the electron makes the transition (higher-energy orbit). The difference between the electron energies in these two orbits is the energy of the absorbed photon.
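Equation 6.49 and Equation 6.50 translate directly into a short calculation of line positions. The sketch below is illustrative only: it uses the rounded values of $E_0$, h, and c quoted in this section, and the function name `transition_wavelength_nm` is an assumption introduced for the example.

```python
E0_EV = 13.6                 # ionization limit of hydrogen, eV
H_EV_S = 4.136e-15           # Planck constant, eV*s
C_M_S = 2.998e8              # speed of light, m/s

# Equation 6.50: the Rydberg constant expressed through E0, h, and c (1/m)
R_H = E0_EV / (H_EV_S * C_M_S)


def transition_wavelength_nm(n, m):
    """Wavelength (nm) of the photon for a transition between levels n > m.

    The same value applies to emission (n -> m) and absorption (m -> n).
    """
    if n <= m:
        raise ValueError("n must be larger than m")
    inverse_wavelength = R_H * (1.0 / m**2 - 1.0 / n**2)   # Equation 6.49
    return 1e9 / inverse_wavelength


print("R_H =", R_H, "1/m")
print("Lyman  2 -> 1:", transition_wavelength_nm(2, 1), "nm")
print("Balmer 3 -> 2:", transition_wavelength_nm(3, 2), "nm")
```

Fixing m and stepping n through m + 1, m + 2, ... generates one spectral series at a time.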
Example 6.10


Size and Ionization Energy of the Hydrogen Atom in an Excited State
If a hydrogen atom in the ground state absorbs a 93.7-nm photon, corresponding to a transition line in the Lyman series, how does this affect the atom’s energy and size? How much energy is needed to ionize the atom when it is in this excited state? Give your answers in absolute units, and relative to the ground state.
Strategy
Before the absorption, the atom is in its ground state. This means that the electron transition takes place from the orbit m = 1 to some higher nth orbit. First, we must determine n for the absorbed wavelength λ = 93.7 nm. Then, we can use Equation 6.45 to find the energy $E_n$ of the excited state and its ionization energy $E_{\infty,n}$, and use Equation 6.40 to find the radius $r_n$ of the atom in the excited state. To estimate n, we use Equation 6.49.
Solution
Substitute m = 1 and λ = 93.7 nm in Equation 6.49 and solve for n. You should not expect to obtain a perfect integer answer because of rounding errors, but your answer will be close to an integer, and you can estimate n by taking the integer part of your answer:


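$\frac{1}{\lambda} = R_H\left(\frac{1}{1^2} - \frac{1}{n^2}\right) \Rightarrow n = \frac{1}{\sqrt{1 - \frac{1}{\lambda R_H}}} = \frac{1}{\sqrt{1 - \frac{1}{(93.7\times10^{-9}\ \text{m})(1.097\times10^{7}\ \text{m}^{-1})}}} = 6.07 \Rightarrow n = 6.$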


The radius of the n = 6 orbit is
$r_n = a_0 n^2 = a_0 6^2 = 36a_0 = 36(0.529\times10^{-10}\ \text{m}) = 19.04\times10^{-10}\ \text{m} \cong 19.0\ \text{Å}.$


Thus, after absorbing the 93.7-nm photon, the size of the hydrogen atom in the excited n = 6 state is 36 times larger than before the absorption, when the atom was in the ground state. The energy of the fifth excited state (n = 6) is:

$E_n = -\frac{E_0}{n^2} = -\frac{E_0}{6^2} = -\frac{E_0}{36} = -\frac{13.6\ \text{eV}}{36} \cong -0.378\ \text{eV}.$


After absorbing the 93.7-nm photon, the energy of the hydrogen atom is larger than it was before the absorption. Ionization of the atom when it is in the fifth excited state (n = 6) requires 36 times less energy than is needed when the atom is in the ground state:

$E_{\infty,6} = -E_6 = -(-0.378\ \text{eV}) = 0.378\ \text{eV}.$
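The steps of this solution can be sketched in a few lines of code. The function name below is hypothetical, and the constants are the rounded values used in this example. One deliberate difference from the wording of the solution: the sketch rounds to the nearest integer rather than truncating, an assumption made here so that an estimate slightly below an integer is not rounded down to the wrong level.

```python
import math

R_H = 1.097e7          # Rydberg constant, 1/m
A0_ANGSTROM = 0.529    # Bohr radius, angstroms
E0_EV = 13.6           # ionization limit of hydrogen, eV


def upper_level_estimate(wavelength_m, m=1):
    """Solve Equation 6.49 for n, given the absorbed wavelength and lower level m."""
    x = 1.0 / m**2 - 1.0 / (wavelength_m * R_H)
    if x <= 0:
        raise ValueError("photon energy is at or above the ionization threshold")
    return 1.0 / math.sqrt(x)


n_estimate = upper_level_estimate(93.7e-9)
n = round(n_estimate)                  # nearest integer (assumption, see lead-in)

radius_angstrom = A0_ANGSTROM * n**2   # Equation 6.40
energy_ev = -E0_EV / n**2              # Equation 6.45
ionization_ev = -energy_ev             # energy needed to reach E = 0

print(n_estimate, n, radius_angstrom, energy_ev, ionization_ev)
```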






Significance
We can analyze any spectral line in the spectrum of hydrogen in the same way. Thus, the experimental measurements of spectral lines provide us with information about the atomic structure of the hydrogen atom.


6.10 Check Your Understanding When an electron in a hydrogen atom is in the first excited state, what prediction does the Bohr model give about its orbital speed and kinetic energy? What is the magnitude of its orbital angular momentum?


Bohr’s model of the hydrogen atom also correctly predicts the spectra of some hydrogen-like ions. Hydrogen-like ions are atoms of elements with an atomic number Z larger than one (Z = 1 for hydrogen) but with all electrons removed except one. For example, an electrically neutral helium atom has an atomic number Z = 2. This means it has two electrons orbiting the nucleus with a charge of q = +Ze. When one of the orbiting electrons is removed from the helium atom (we say, when the helium atom is singly ionized), what remains is a hydrogen-like atomic structure where the remaining electron orbits the nucleus with a charge of q = +Ze. This type of situation is described by the Bohr model. Assuming that the charge of the nucleus is not +e but +Ze, we can repeat all steps, beginning with Equation 6.36, to obtain the results for a hydrogen-like ion:


(6.51) $r_n = \frac{a_0}{Z}\,n^2$


where $a_0$ is the Bohr radius of hydrogen, and


(6.52) $E_n = -Z^2 E_0\frac{1}{n^2}$


where E0 is the ionization limit of a hydrogen atom. These equations are good approximations as long as the atomic
number Z is not too large.
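For a concrete case of Equation 6.51 and Equation 6.52, the sketch below evaluates the ground state of singly ionized helium (Z = 2). The function names are illustrative assumptions; the constants are the rounded hydrogen values from this section.

```python
A0_ANGSTROM = 0.529    # Bohr radius of hydrogen, angstroms
E0_EV = 13.6           # ionization limit of hydrogen, eV


def ion_orbit_radius(n, z=1):
    """Equation 6.51: r_n = (a0 / Z) n^2, in angstroms."""
    return A0_ANGSTROM * n**2 / z


def ion_energy_level(n, z=1):
    """Equation 6.52: E_n = -Z^2 E0 / n^2, in electron volts."""
    return -(z**2) * E0_EV / n**2


# Ground state of He+ (Z = 2): a smaller orbit, bound four times more strongly
print(ion_orbit_radius(1, z=2), "angstrom")
print(ion_energy_level(1, z=2), "eV")
```

Setting z = 1 recovers the hydrogen results of Equation 6.40 and Equation 6.45.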
The Bohr model is important because it was the first model to postulate the quantization of electron orbits in atoms. Thus, it represents an early quantum theory that gave a start to developing modern quantum theory. It introduced the concept of a quantum number to describe atomic states. The limitation of the early quantum theory is that it cannot describe atoms in which the number of electrons orbiting the nucleus is larger than one. The Bohr model of hydrogen is a semi-classical model because it combines the classical concept of electron orbits with the new concept of quantization. The remarkable success of this model prompted many physicists to seek an explanation for why such a model should work at all, and to seek an understanding of the physics behind the postulates of early quantum theory. This search brought about the onset of an entirely new concept of “matter waves.”






6.5 | De Broglie’s Matter Waves
Learning Objectives


By the end of this section, you will be able to:
• Describe de Broglie’s hypothesis of matter waves
• Explain how de Broglie’s hypothesis gives the rationale for the quantization of angular momentum in Bohr’s quantum theory of the hydrogen atom
• Describe the Davisson–Germer experiment
• Interpret de Broglie’s idea of matter waves and how they account for electron diffraction phenomena


Compton’s formula established that an electromagnetic wave can behave like a particle of light when interacting with matter. In 1924, Louis de Broglie proposed a new speculative hypothesis that electrons and other particles of matter can behave like waves. Today, this idea is known as de Broglie’s hypothesis of matter waves. In 1926, de Broglie’s hypothesis, together with Bohr’s early quantum theory, led to the development of a new theory of wave quantum mechanics to describe the physics of atoms and subatomic particles. Quantum mechanics has paved the way for new engineering inventions and technologies, such as the laser and magnetic resonance imaging (MRI). These new technologies drive discoveries in other sciences such as biology and chemistry.
According to de Broglie’s hypothesis, massless photons as well as massive particles must satisfy one common set of relations that connect the energy E with the frequency f, and the linear momentum p with the wavelength λ. We have discussed these relations for photons in the context of Compton’s effect. We are recalling them now in a more general context. Any particle that has energy and momentum is a de Broglie wave of frequency f and wavelength λ:


(6.53) $E = hf$


(6.54) $\lambda = \frac{h}{p}.$


Here, E and p are, respectively, the relativistic energy and the momentum of a particle. De Broglie’s relations are usually expressed in terms of the wave vector $\vec{k}$, $k = 2\pi/\lambda$, and the wave frequency $\omega = 2\pi f$, as we usually do for waves:


(6.55) $E = \hbar\omega$
(6.56) $\vec{p} = \hbar\vec{k}.$


Wave theory tells us that a wave carries its energy with the group velocity. For matter waves, this group velocity is the
velocity u of the particle. Identifying the energy E and momentum p of a particle with its relativistic energy mc2 and its
relativistic momentum mu, respectively, it follows from de Broglie relations that matter waves satisfy the following relation:


(6.57) $\lambda f = \frac{\omega}{k} = \frac{E/\hbar}{p/\hbar} = \frac{E}{p} = \frac{mc^2}{mu} = \frac{c^2}{u} = \frac{c}{\beta}$


where β = u /c. When a particle is massless we have u = c and Equation 6.57 becomes λ f = c.
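Equation 6.57 can be verified numerically for any speed. The following is a minimal sketch, assuming rounded constants and an electron as the test particle; `matter_wave` is a hypothetical helper introduced only for this check.

```python
import math

C = 2.998e8        # speed of light, m/s
H = 6.626e-34      # Planck constant, J*s
M_E = 9.109e-31    # electron rest mass, kg


def matter_wave(beta, rest_mass=M_E):
    """Return (wavelength, frequency) of the de Broglie wave for speed u = beta * c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    energy = gamma * rest_mass * C**2          # relativistic energy, E = m c^2
    momentum = gamma * rest_mass * beta * C    # relativistic momentum, p = m u
    return H / momentum, energy / H            # Equations 6.54 and 6.53


wavelength, frequency = matter_wave(0.5)
phase_speed = wavelength * frequency           # left-hand side of Equation 6.57

print(phase_speed / C)                         # compare with 1 / beta
```

The product λf exceeds c for a massive particle; this does not conflict with relativity because, as stated above, energy travels with the group velocity u.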






Example 6.11
How Long Are de Broglie Matter Waves?
Calculate the de Broglie wavelength of: (a) a 0.65-kg basketball thrown at a speed of 10 m/s, (b) a nonrelativistic electron with a kinetic energy of 1.0 eV, and (c) a relativistic electron with a kinetic energy of 108 keV.
Strategy
We use Equation 6.54 to find the de Broglie wavelength. When the problem involves a nonrelativistic object moving with a nonrelativistic speed u, such as in (a) when $\beta = u/c \ll 1$, we use nonrelativistic momentum p. When the nonrelativistic approximation cannot be used, such as in (c), we must use the relativistic momentum $p = mu = m_0\gamma u = E_0\gamma\beta/c$, where the rest mass energy of a particle is $E_0 = m_0c^2$ and $\gamma$ is the Lorentz factor $\gamma = 1/\sqrt{1-\beta^2}$. The total energy E of a particle is given by Equation 6.53 and the kinetic energy is $K = E - E_0 = (\gamma - 1)E_0$. When the kinetic energy is known, we can invert Equation 6.18 to find the momentum $p = \sqrt{(E^2 - E_0^2)/c^2} = \sqrt{K(K + 2E_0)}/c$ and substitute in Equation 6.54 to obtain


(6.58) $\lambda = \frac{h}{p} = \frac{hc}{\sqrt{K(K + 2E_0)}}.$
Depending on the problem at hand, in this equation we can use the following values for hc:
$hc = (6.626\times10^{-34}\ \text{J}\cdot\text{s})(2.998\times10^{8}\ \text{m/s}) = 1.986\times10^{-25}\ \text{J}\cdot\text{m} = 1.241\ \text{eV}\cdot\mu\text{m}$


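Solution
a. For the basketball, the kinetic energy is
$K = m_0u^2/2 = (0.65\ \text{kg})(10\ \text{m/s})^2/2 = 32.5\ \text{J}$

and the rest mass energy is
$E_0 = m_0c^2 = (0.65\ \text{kg})(2.998\times10^{8}\ \text{m/s})^2 = 5.84\times10^{16}\ \text{J}.$

We see that $K/(K + E_0) \ll 1$ and use $p = m_0u = (0.65\ \text{kg})(10\ \text{m/s}) = 6.5\ \text{J}\cdot\text{s/m}$:
$\lambda = \frac{h}{p} = \frac{6.626\times10^{-34}\ \text{J}\cdot\text{s}}{6.5\ \text{J}\cdot\text{s/m}} = 1.02\times10^{-34}\ \text{m}.$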


b. For the nonrelativistic electron,
$E_0 = m_0c^2 = (9.109\times10^{-31}\ \text{kg})(2.998\times10^{8}\ \text{m/s})^2 = 511\ \text{keV}$

and when K = 1.0 eV, we have $K/(K + E_0) = (1/511)\times10^{-3} \ll 1$, so we can use the nonrelativistic formula. However, it is simpler here to use Equation 6.58:

$\lambda = \frac{h}{p} = \frac{hc}{\sqrt{K(K + 2E_0)}} = \frac{1.241\ \text{eV}\cdot\mu\text{m}}{\sqrt{(1.0\ \text{eV})[1.0\ \text{eV} + 2(511\ \text{keV})]}} = 1.23\ \text{nm}.$

If we use nonrelativistic momentum, we obtain the same result because 1 eV is much smaller than the rest mass energy of the electron.
c. For a fast electron with K = 108 keV, relativistic effects cannot be neglected because its total energy is $E = K + E_0 = 108\ \text{keV} + 511\ \text{keV} = 619\ \text{keV}$ and $K/E = 108/619$ is not negligible:
$\lambda = \frac{h}{p} = \frac{hc}{\sqrt{K(K + 2E_0)}} = \frac{1.241\ \text{eV}\cdot\mu\text{m}}{\sqrt{108\ \text{keV}[108\ \text{keV} + 2(511\ \text{keV})]}} = 3.55\ \text{pm}.$
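The three parts of this example can be reproduced with a short script. This is a sketch under stated assumptions: it uses the rounded value hc = 1.241 eV·µm from the strategy, and the function names are introduced here for illustration. It also evaluates the nonrelativistic approximation for part (c) to show how much it overestimates the wavelength.

```python
import math

HC_EV_M = 1.241e-6          # hc in eV*m (1.241 eV*um, the value used in the text)
H_J_S = 6.626e-34           # Planck constant, J*s
E0_ELECTRON_EV = 511e3      # electron rest mass energy, eV


def wavelength_from_kinetic_energy(k_ev, e0_ev):
    """Equation 6.58: lambda = hc / sqrt(K (K + 2 E0)), in meters."""
    return HC_EV_M / math.sqrt(k_ev * (k_ev + 2.0 * e0_ev))


def wavelength_nonrelativistic(k_ev, e0_ev):
    """Nonrelativistic limit, p c = sqrt(2 E0 K), in meters."""
    return HC_EV_M / math.sqrt(2.0 * e0_ev * k_ev)


lam_ball = H_J_S / (0.65 * 10.0)                               # (a) lambda = h / (m0 u)
lam_slow = wavelength_from_kinetic_energy(1.0, E0_ELECTRON_EV)     # (b)
lam_fast = wavelength_from_kinetic_energy(108e3, E0_ELECTRON_EV)   # (c)
lam_fast_nr = wavelength_nonrelativistic(108e3, E0_ELECTRON_EV)

print(lam_ball, lam_slow, lam_fast, lam_fast_nr)
```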






Significance
We see from these estimates that de Broglie wavelengths of macroscopic objects such as a ball are immeasurably small. Therefore, even if they exist, they are not detectable and do not affect the motion of macroscopic objects.


6.11 Check Your Understanding What is de Broglie’s wavelength of a nonrelativistic proton with a kinetic energy of 1.0 eV?


Using the concept of the electron matter wave, de Broglie provided a rationale for the quantization of the electron’s angular momentum in the hydrogen atom, which was postulated in Bohr’s quantum theory. The physical explanation for the first Bohr quantization condition comes naturally when we assume that an electron in a hydrogen atom behaves not like a particle but like a wave. To see it clearly, imagine a stretched guitar string that is clamped at both ends and vibrates in one of its normal modes. If the length of the string is l (Figure 6.18), the wavelengths of these vibrations cannot be arbitrary but must be such that an integer k number of half-wavelengths λ/2 fit exactly on the distance l between the ends. This is the condition $l = k\lambda/2$ for a standing wave on a string. Now suppose that instead of having the string clamped at the walls, we bend its length into a circle and fasten its ends to each other. This produces a circular string that vibrates in normal modes, satisfying the same standing-wave condition, but the number of half-wavelengths must now be an even number k, k = 2n, and the length l is now connected to the radius $r_n$ of the circle. This means that the radii are not arbitrary but must satisfy the following standing-wave condition:


(6.59) $2\pi r_n = 2n\frac{\lambda}{2}.$
If an electron in the nth Bohr orbit moves as a wave, by Equation 6.59 its wavelength must be equal to $\lambda = 2\pi r_n/n$. Assuming that Equation 6.58 is valid, the electron wave of this wavelength corresponds to the electron’s linear momentum, $p = h/\lambda = nh/(2\pi r_n) = n\hbar/r_n$. In a circular orbit, therefore, the electron’s angular momentum must be


(6.60) $L_n = r_n p = r_n\frac{n\hbar}{r_n} = n\hbar.$
This equation is the first of Bohr’s quantization conditions, given by Equation 6.34. Providing a physical explanation for Bohr’s quantization condition is a convincing theoretical argument for the existence of matter waves.
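The chain of reasoning from the standing-wave condition to the quantized angular momentum can be traced numerically. The sketch below assumes the rounded Bohr radius of 0.529 Å and uses illustrative function names; it fits n whole wavelengths on the circumference of the nth orbit and recovers $L_n = n\hbar$.

```python
import math

H = 6.626e-34                 # Planck constant, J*s
HBAR = H / (2.0 * math.pi)    # reduced Planck constant, J*s
A0 = 0.529e-10                # Bohr radius, m


def orbit_radius(n):
    """Radius of the nth Bohr orbit, r_n = a0 n^2, in meters."""
    return A0 * n**2


def standing_wave_wavelength(n):
    """Equation 6.59 solved for the wavelength: lambda = 2 pi r_n / n."""
    return 2.0 * math.pi * orbit_radius(n) / n


def angular_momentum(n):
    """L_n = r_n p with p = h / lambda (Equation 6.60), in J*s."""
    momentum = H / standing_wave_wavelength(n)
    return orbit_radius(n) * momentum


for n in (1, 2, 3):
    print(n, standing_wave_wavelength(n), angular_momentum(n) / HBAR)
```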


Figure 6.18 Standing-wave pattern: (a) a stretched string clamped at the walls; (b) an electron wave trapped in the third Bohr orbit in the hydrogen atom.


Example 6.12
The Electron Wave in the Ground State of Hydrogen
Find the de Broglie wavelength of an electron in the ground state of hydrogen.






Strategy
We combine the first quantization condition in Equation 6.60 with Equation 6.36 and use Equation 6.38 for the first Bohr radius with n = 1.
Solution
When n = 1 and $r_n = a_0 = 0.529\ \text{Å}$, the Bohr quantization condition gives $a_0 p = 1\cdot\hbar \Rightarrow p = \hbar/a_0$. The electron wavelength is:


$\lambda = h/p = h/(\hbar/a_0) = 2\pi a_0 = 2\pi(0.529\ \text{Å}) = 3.324\ \text{Å}.$


Significance
We obtain the same result when we use Equation 6.58 directly.


6.12 Check Your Understanding Find the de Broglie wavelength of an electron in the third excited state of hydrogen.


Experimental confirmation of matter waves came in 1927 when C. Davisson and L. Germer performed a series of electron-scattering experiments that clearly showed that electrons do behave like waves. Davisson and Germer did not set up their experiment to confirm de Broglie’s hypothesis: The confirmation came as a byproduct of their routine experimental studies of metal surfaces under electron bombardment.
In the particular experiment that provided the very first evidence of electron waves (known today as the Davisson–Germer experiment), they studied a surface of nickel. Their nickel sample was specially prepared in a high-temperature oven to change its usual polycrystalline structure to a form in which large single-crystal domains occupy the volume. Figure 6.19 shows the experimental setup. Thermal electrons are released from a heated element (usually made of tungsten) in the electron gun and accelerated through a potential difference ΔV, becoming a well-collimated beam of electrons. The kinetic energy K of the electrons is adjusted by selecting a value of the potential difference in the electron gun. This produces a beam of electrons with a set value of linear momentum, in accordance with the conservation of energy:


(6.61) $e\Delta V = K = \frac{p^2}{2m} \Rightarrow p = \sqrt{2me\Delta V}.$


The electron beam is incident on the nickel sample in the direction normal to its surface. At the surface, it scatters in various directions. The intensity of the beam scattered in a selected direction φ is measured by a highly sensitive detector. The detector’s angular position with respect to the direction of the incident beam can be varied from φ = 0° to φ = 90°. The entire setup is enclosed in a vacuum chamber to prevent electron collisions with air molecules, as such thermal collisions would change the electrons’ kinetic energy and are not desirable.






Figure 6.19 Schematics of the experimental setup of the Davisson–Germer diffraction experiment. A well-collimated beam of electrons is scattered off the nickel target. The kinetic energy of electrons in the incident beam is selected by adjusting a variable potential, ΔV, in the electron gun. Intensity of the scattered electron beam is measured for a range of scattering angles φ, whereas the distance between the detector and the target does not change.


When the nickel target has a polycrystalline form with many randomly oriented microscopic crystals, the incident electrons scatter off its surface in various random directions. As a result, the intensity of the scattered electron beam is much the same in any direction, resembling a diffuse reflection of light from a porous surface. However, when the nickel target has a regular crystalline structure, the intensity of the scattered electron beam shows a clear maximum at a specific angle and the results show a clear diffraction pattern (see Figure 6.20). Similar diffraction patterns formed by X-rays scattered by various crystalline solids were studied in 1912 by father-and-son physicists William H. Bragg and William L. Bragg. The Bragg law in X-ray crystallography provides a connection between the wavelength λ of the radiation incident on a crystalline lattice, the lattice spacing, and the position of the interference maximum in the diffracted radiation (see Diffraction).
The lattice spacing of the Davisson–Germer target, determined with X-ray crystallography, was measured to be a = 2.15 Å. Unlike X-ray crystallography in which X-rays penetrate the sample, in the original Davisson–Germer experiment, only the surface atoms interact with the incident electron beam. For the surface diffraction, the maximum intensity of the reflected electron beam is observed for scattering angles that satisfy the condition $n\lambda = a\sin\varphi$ (see Figure 6.21). The first-order maximum (for n = 1) is measured at a scattering angle of φ ≈ 50° at ΔV ≈ 54 V, which gives the wavelength of the incident radiation as $\lambda = (2.15\ \text{Å})\sin 50° = 1.65\ \text{Å}$. On the other hand, a 54-V potential accelerates the incident electrons to kinetic energies of K = 54 eV. Their momentum, calculated from Equation 6.61, is $p = 2.478\times10^{-5}\ \text{eV}\cdot\text{s/m}$. When we substitute this result in Equation 6.58, the de Broglie wavelength is obtained as


(6.62) $\lambda = \frac{h}{p} = \frac{4.136\times10^{-15}\ \text{eV}\cdot\text{s}}{2.478\times10^{-5}\ \text{eV}\cdot\text{s/m}} = 1.67\ \text{Å}.$


The same result is obtained when we use K = 54 eV in Equation 6.58. The proximity of this theoretical result to the Davisson–Germer experimental value of λ = 1.65 Å is a convincing argument for the existence of de Broglie matter waves.
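The comparison between the diffraction measurement and the de Broglie prediction can be sketched as follows. The function names are hypothetical, and the constants are the rounded values used in this section; the momentum is nonrelativistic, as in Equation 6.61.

```python
import math

H_EV_S = 4.136e-15        # Planck constant, eV*s
C_M_S = 2.998e8           # speed of light, m/s
E0_ELECTRON_EV = 511e3    # electron rest mass energy, eV


def wavelength_from_diffraction(spacing_m, angle_deg, order=1):
    """Surface-diffraction condition n * lambda = a * sin(phi), solved for lambda."""
    return spacing_m * math.sin(math.radians(angle_deg)) / order


def wavelength_from_voltage(delta_v):
    """de Broglie wavelength of an electron accelerated through delta_v volts.

    Uses p = sqrt(2 m K) (Equation 6.61) written as p c = sqrt(2 E0 K),
    where K in eV is numerically equal to delta_v.
    """
    pc_ev = math.sqrt(2.0 * E0_ELECTRON_EV * delta_v)
    return H_EV_S * C_M_S / pc_ev


lam_measured = wavelength_from_diffraction(2.15e-10, 50.0)
lam_predicted = wavelength_from_voltage(54.0)
relative_gap = abs(lam_predicted - lam_measured) / lam_predicted

print(lam_measured, lam_predicted, relative_gap)
```

The two wavelengths differ by roughly one percent, which is small compared with the breadth of the diffraction peak at low electron energies.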






Figure 6.20 The experimental results of electron diffraction on a nickel target for the accelerating potential in the electron gun of about ΔV = 54 V: The intensity maximum is registered at the scattering angle of about φ = 50°.


Figure 6.21 In the surface diffraction of a monochromatic electromagnetic wave on a crystalline lattice structure, the in-phase incident beams are reflected from atoms on the surface. A ray reflected from the left atom travels an additional distance $D = a\sin\varphi$ to the detector, where a is the lattice spacing. The reflected beams remain in-phase when D is an integer multiple of their wavelength λ. The intensity of the reflected waves has pronounced maxima for angles φ satisfying $n\lambda = a\sin\varphi$.


Diffraction lines measured with low-energy electrons, such as those used in the Davisson–Germer experiment, are quite broad (see Figure 6.20) because the incident electrons are scattered only from the surface. The resolution of diffraction images greatly improves when a higher-energy electron beam passes through a thin metal foil. This occurs because the diffraction image is created by scattering off many crystalline planes inside the volume, and the maxima produced in scattering at Bragg angles are sharp (see Figure 6.22).






Figure 6.22 Diffraction patterns obtained in scattering on a crystalline solid: (a) with X-rays, and (b) with electrons. The observed pattern reflects the symmetry of the crystalline structure of the sample.


Since the work of Davisson and Germer, de Broglie’s hypothesis has been extensively tested with various experimental techniques, and the existence of de Broglie waves has been confirmed for numerous elementary particles. Neutrons have been used in scattering experiments to determine crystalline structures of solids from interference patterns formed by neutron matter waves. The neutron has zero charge and its mass is comparable with the mass of a positively charged proton. Both neutrons and protons can be seen as matter waves. Therefore, the property of being a matter wave is not specific to electrically charged particles but is true of all particles in motion. Matter waves of molecules as large as carbon $C_{60}$ have been measured. All physical objects, small or large, have an associated matter wave as long as they remain in motion. The universal character of de Broglie matter waves is firmly established.
Example 6.13


Neutron Scattering
Suppose that a neutron beam is used in a diffraction experiment on a typical crystalline solid. Estimate the kinetic energy of a neutron (in eV) in the neutron beam and compare it with the kinetic energy of an ideal gas in equilibrium at room temperature.
Strategy
We assume that a typical crystal spacing a is of the order of 1.0 Å. To observe a diffraction pattern on such a lattice, the neutron wavelength λ must be on the same order of magnitude as the lattice spacing. We use Equation 6.61 to find the momentum p and kinetic energy K. To compare this energy with the energy $E_T$ of ideal gas in equilibrium at room temperature T = 300 K, we use the relation $K = \frac{3}{2}k_BT$, where $k_B = 8.62\times10^{-5}\ \text{eV/K}$ is the Boltzmann constant.
Solution
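We evaluate pc to compare it with the neutron’s rest mass energy $E_0 = 940\ \text{MeV}$:


$p = \frac{h}{\lambda} \Rightarrow pc = \frac{hc}{\lambda} = \frac{1.241\times10^{-6}\ \text{eV}\cdot\text{m}}{10^{-10}\ \text{m}} = 12.41\ \text{keV}.$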




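We see that $p^2c^2 \ll E_0^2$ so $K \ll E_0$ and we can use the nonrelativistic kinetic energy:
$K = \frac{p^2}{2m_n} = \frac{h^2}{2\lambda^2 m_n} = \frac{(6.63\times10^{-34}\ \text{J}\cdot\text{s})^2}{(2\times10^{-20}\ \text{m}^2)(1.66\times10^{-27}\ \text{kg})} = 1.32\times10^{-20}\ \text{J} = 82.7\ \text{meV}.$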


Kinetic energy of ideal gas in equilibrium at 300 K is:
$K_T = \frac{3}{2}k_BT = \frac{3}{2}(8.62\times10^{-5}\ \text{eV/K})(300\ \text{K}) = 38.8\ \text{meV}.$


We see that these energies are of the same order of magnitude.
Significance
Neutrons with energies in this range, which is typical for an ideal gas at room temperature, are called “thermal neutrons.”
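The estimate in this example can be reproduced with the short sketch below. It uses the same rounded constants as the solution, including the rounded neutron mass of 1.66 × 10⁻²⁷ kg; the function names are assumptions introduced for illustration.

```python
H_J_S = 6.63e-34          # Planck constant, J*s
M_NEUTRON_KG = 1.66e-27   # neutron mass as rounded in the solution, kg
J_PER_EV = 1.602e-19      # joules per electron volt
KB_EV_PER_K = 8.62e-5     # Boltzmann constant, eV/K


def kinetic_energy_ev(wavelength_m, mass_kg):
    """Nonrelativistic K = h^2 / (2 m lambda^2), converted to eV."""
    return H_J_S**2 / (2.0 * mass_kg * wavelength_m**2) / J_PER_EV


def thermal_energy_ev(temperature_k):
    """Average kinetic energy (3/2) k_B T of an ideal-gas particle, in eV."""
    return 1.5 * KB_EV_PER_K * temperature_k


k_neutron = kinetic_energy_ev(1.0e-10, M_NEUTRON_KG)
k_thermal = thermal_energy_ev(300.0)

print(k_neutron * 1e3, "meV")
print(k_thermal * 1e3, "meV")
```

Both energies come out in the tens of millielectron volts.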


Example 6.14
Wavelength of a Relativistic Proton
In a supercollider at CERN, protons can be accelerated to velocities of 0.75c. What are their de Broglie wavelengths at this speed? What are their kinetic energies?
Strategy
The rest mass energy of a proton is $E_0 = m_0c^2 = (1.672\times10^{-27}\ \text{kg})(2.998\times10^{8}\ \text{m/s})^2 = 938\ \text{MeV}$. When the proton’s velocity is known, we have β = 0.75 and $\beta\gamma = 0.75/\sqrt{1 - 0.75^2} = 1.134$. We obtain the wavelength λ and kinetic energy K from relativistic relations.
Solution


$\lambda = \frac{h}{p} = \frac{hc}{pc} = \frac{hc}{\beta\gamma E_0} = \frac{1.241\ \text{eV}\cdot\mu\text{m}}{1.134(938\ \text{MeV})} = 1.17\ \text{fm}$


$K = E_0(\gamma - 1) = 938\ \text{MeV}\left(1/\sqrt{1 - 0.75^2} - 1\right) = 480.1\ \text{MeV}$


Significance
Notice that because a proton is 1835 times more massive than an electron, if this experiment were performed with electrons, a simple rescaling of these results would give us the electron’s wavelength of (1835)(1.17 fm) = 2.1 pm and its kinetic energy of 480.1 MeV/1835 = 261.6 keV.
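A short numerical sketch of the relativistic relations in this example follows, with the Lorentz factor evaluated as $\gamma = 1/\sqrt{1-\beta^2}$. The function name is hypothetical, and hc = 1.241 eV·µm is rewritten as 1241 MeV·fm so that the wavelength comes out directly in femtometers.

```python
import math

HC_MEV_FM = 1241.0    # hc = 1.241 eV*um expressed in MeV*fm


def wavelength_and_kinetic_energy(beta, rest_energy_mev):
    """Return (wavelength in fm, kinetic energy in MeV) for speed u = beta * c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    pc_mev = beta * gamma * rest_energy_mev      # p c = beta * gamma * E0
    wavelength_fm = HC_MEV_FM / pc_mev           # lambda = hc / (pc)
    kinetic_mev = (gamma - 1.0) * rest_energy_mev
    return wavelength_fm, kinetic_mev


lam_proton, k_proton = wavelength_and_kinetic_energy(0.75, 938.0)
lam_electron, k_electron = wavelength_and_kinetic_energy(0.75, 0.511)

print(lam_proton, "fm", k_proton, "MeV")
print(lam_electron / 1000.0, "pm", k_electron * 1000.0, "keV")
```

At a fixed speed the wavelength scales inversely with the rest energy and the kinetic energy scales in proportion to it, which is why the electron values follow from the proton values by the mass ratio.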


6.13 Check Your Understanding Find the de Broglie wavelength and kinetic energy of a free electron that travels at a speed of 0.75c.






6.6 | Wave-Particle Duality
Learning Objectives


By the end of this section, you will be able to:
• Identify phenomena in which electromagnetic waves behave like a beam of photons and particles behave like waves
• Describe the physics principles behind electron microscopy
• Summarize the evolution of scientific thought that led to the development of quantum mechanics


The energy of radiation detected by a radio-signal receiving antenna comes as the energy of an electromagnetic wave. The same energy of radiation detected by a photocurrent in the photoelectric effect comes as the energy of individual photon particles. Therefore, the question arises about the nature of electromagnetic radiation: Is a photon a wave or is it a particle? Similar questions may be asked about other known forms of energy. For example, an electron that forms part of an electric current in a circuit behaves like a particle moving in unison with other electrons inside the conductor. The same electron behaves as a wave when it passes through a solid crystalline structure and forms a diffraction image. Is an electron a wave or is it a particle? The same question can be extended to all particles of matter (elementary particles as well as compound molecules), asking about their true physical nature. At our present state of knowledge, such questions about the true nature of things do not have conclusive answers. All we can say is that wave-particle duality exists in nature: Under some experimental conditions, a particle appears to act as a particle, and under different experimental conditions, a particle appears to act as a wave. Conversely, under some physical circumstances electromagnetic radiation acts as a wave, and under other physical circumstances, radiation acts as a beam of photons.
This dualistic interpretation is not a new physics concept brought about by specific discoveries in the twentieth century. It was already present in a debate between Isaac Newton and Christiaan Huygens about the nature of light, beginning in the year 1670. According to Newton, a beam of light is a collection of corpuscles of light. According to Huygens, light is a wave. The corpuscular hypothesis failed in 1803, when Thomas Young announced his double-slit interference experiment with light (see Figure 6.23), which firmly established light as a wave. In James Clerk Maxwell’s theory of electromagnetism (completed by the year 1873), light is an electromagnetic wave. Maxwell’s classical view of radiation as an electromagnetic wave is still valid today; however, it is unable to explain blackbody radiation and the photoelectric effect, where light acts as a beam of photons.






Figure 6.23 Young’s double-slit experiment explains the interference of light by making an analogy with the interference of water waves. Two waves are generated at the positions of two slits in an opaque screen. The waves have the same wavelengths. They travel from their origins at the slits to the viewing screen placed to the right of the slits. The waves meet on the viewing screen. At the positions marked “Max” on the screen, the meeting waves are in-phase and the combined wave amplitude is enhanced. At positions marked “Min,” the combined wave amplitude is zero. For light, this mechanism creates a bright-and-dark fringe pattern on the viewing screen.


A similar dichotomy existed in the interpretation of electricity. From Benjamin Franklin’s observations of electricity in 1751 until J.J. Thomson’s discovery of the electron in 1897, electric current was seen as a flow in a continuous electric medium. Within this theory of electric fluid, the present theory of electric circuits was developed, and electromagnetism and electromagnetic induction were discovered. Thomson’s experiment showed that the unit of negative electric charge (an electron) can travel in a vacuum without any medium to carry the charge around, as in electric circuits. This discovery changed the way in which electricity is understood today and gave the electron its particle status. In Bohr’s early quantum theory of the hydrogen atom, both the electron and the proton are particles of matter. Likewise, in the Compton scattering of X-rays on electrons, the electron is a particle. On the other hand, in electron-scattering experiments on crystalline structures, the electron behaves as a wave.
A skeptic may ask whether an electron might always be nothing more than a particle, and whether the diffraction images obtained in electron-scattering experiments might be explained within some macroscopic model of a crystal and a macroscopic model of electrons coming at it like a rain of ping-pong balls. As a matter of fact, to investigate this question, we do not need a complex model of a crystal but just a couple of simple slits in a screen that is opaque to electrons. In other words, to gather convincing evidence about the nature of an electron, we need to repeat the Young double-slit experiment with electrons. If the electron is a wave, we should observe the formation of interference patterns typical for waves, such as those described in Figure 6.23, even when electrons come through the slits one by one. However, if the electron is not a wave but a particle, the interference fringes will not be formed.
The very first double-slit experiment with a beam of electrons, performed by Claus Jönsson in Germany in 1961, demonstrated that a beam of electrons indeed forms an interference pattern, which means that electrons collectively behave as a wave. The first double-slit experiments with single electrons passing through the slits one-by-one were performed by Giulio Pozzi in 1974 in Italy and by Akira Tonomura in 1989 in Japan. They show that interference fringes are formed gradually, even when electrons pass through the slits individually. This demonstrates conclusively that electron-diffraction images are formed because of the wave nature of electrons. The results seen in double-slit experiments with electrons are illustrated by the images of the interference pattern in Figure 6.24.




Figure 6.24 Computer-simulated interference fringes seen in the Young double-slit experiment with electrons. One pattern is gradually formed on the screen, regardless of whether the electrons come through the slits as a beam or individually one-by-one.


Example 6.15
Double-Slit Experiment with Electrons
In one experimental setup for studying interference patterns of electron waves, two slits are created in a gold-coated silicon membrane. Each slit is 62-nm wide and 4-µm long, and the separation between the slits is 272 nm.
The electron beam is created in an electron gun by heating a tungsten element and by accelerating the electrons across a 600-V potential. The beam is subsequently collimated using electromagnetic lenses, and the collimated beam of electrons is sent through the slits. Find the angular position of the first-order bright fringe on the viewing screen.
Strategy
Recall that the angular position θ of the nth-order bright fringe that is formed in Young’s two-slit interference pattern (discussed in a previous chapter) is related to the separation, d, between the slits and to the wavelength, λ, of the incident light by the equation d sin θ = nλ, where n = 0, ±1, ±2, .... The separation is given and is equal to d = 272 nm. For the first-order fringe, we take n = 1. The only thing we now need is the wavelength of the incident electron wave.
Since the electron has been accelerated from rest across a potential difference of ΔV = 600 V, its kinetic energy is K = eΔV = 600 eV. The rest-mass energy of the electron is E0 = 511 keV.
We compute its de Broglie wavelength as that of a nonrelativistic electron because its kinetic energy K is much smaller than its rest energy E0, K ≪ E0.
Solution
The electron’s wavelength is


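\[
\lambda = \frac{h}{p} = \frac{h}{\sqrt{2 m_e K}} = \frac{h}{\sqrt{2 (E_0/c^2) K}} = \frac{hc}{\sqrt{2 E_0 K}} = \frac{1.241 \times 10^{-6}\ \mathrm{eV \cdot m}}{\sqrt{2(511\ \mathrm{keV})(600\ \mathrm{eV})}} = 0.050\ \mathrm{nm}.
\]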


This λ is used to obtain the position of the first bright fringe:
\[
\sin\theta = \frac{1 \cdot \lambda}{d} = \frac{0.050\ \mathrm{nm}}{272\ \mathrm{nm}} = 0.000184 \;\Rightarrow\; \theta = 0.010^\circ.
\]


Significance
Notice that this is also the angular separation between any two consecutive bright fringes (for example, between the zero-order fringe and the first-order fringe, between the first-order fringe and the second-order fringe, and so on) up to about n = 1000.
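The arithmetic of Example 6.15 can be checked numerically. The following is a minimal sketch, not part of the original example: the function names are illustrative, and it assumes the same rounded constants the example uses (hc = 1.241 × 10−6 eV·m and E0 = 511 keV) together with the nonrelativistic formula λ = hc/√(2E0K).

```python
import math

HC_EV_M = 1.241e-6   # h*c in eV*m (rounded value used in Example 6.15)
E0_EV = 511e3        # electron rest energy in eV


def de_broglie_wavelength_nonrel(kinetic_energy_ev):
    """Nonrelativistic de Broglie wavelength in meters: hc / sqrt(2 * E0 * K)."""
    return HC_EV_M / math.sqrt(2.0 * E0_EV * kinetic_energy_ev)


def bright_fringe_angle_deg(order, wavelength_m, slit_separation_m):
    """Angle in degrees of the nth-order bright fringe from d*sin(theta) = n*lambda."""
    return math.degrees(math.asin(order * wavelength_m / slit_separation_m))


# Values from the example: K = 600 eV, d = 272 nm, first-order fringe (n = 1).
wavelength = de_broglie_wavelength_nonrel(600.0)
theta_1 = bright_fringe_angle_deg(1, wavelength, 272e-9)

print(f"lambda  = {wavelength * 1e9:.3f} nm")
print(f"theta_1 = {theta_1:.4f} degrees")
```

The printed wavelength rounds to the 0.050 nm quoted in the solution, and the angle comes out close to 0.01°. Small differences in the last digit depend on whether the wavelength is rounded before it is divided by d.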


6.14 Check Your Understanding For the situation described in Example 6.15, find the angular position of the fifth-order bright fringe on the viewing screen.


The wave-particle dual nature of matter particles and of radiation is a declaration of our inability to describe physical reality within one unified classical theory because separately neither a classical particle approach nor a classical wave approach can fully explain the observed phenomena. This limitation of the classical approach was realized by the year 1928, and a foundation for a new statistical theory, called quantum mechanics, was put in place by Bohr, Erwin Schrödinger, Werner Heisenberg, and Paul Dirac. Quantum mechanics takes de Broglie’s idea of matter waves to be the fundamental property of all particles and gives it a statistical interpretation. According to this interpretation, a wave that is associated with a particle carries information about the probable positions of the particle and about its other properties. A single particle is seen as a moving wave packet such as the one shown in Figure 6.25. We can intuitively sense from this example that if a particle is a wave packet, we will not be able to measure its exact position in the same sense as we cannot pinpoint a location of a wave packet in a vibrating guitar string. The uncertainty, Δx, in measuring the particle’s position is connected to the uncertainty, Δp, in the simultaneous measuring of its linear momentum by Heisenberg’s uncertainty principle:


\[
\Delta x \, \Delta p \geq \frac{1}{2}\hbar. \tag{6.63}
\]


Heisenberg’s principle expresses the law of nature that, at the quantum level, our perception is limited. For example, if we know the exact position of a body (which means that Δx = 0 in Equation 6.63), then at the same time we cannot know its momentum, because the uncertainty in its momentum becomes infinite (because Δp ≥ 0.5ℏ/Δx in Equation 6.63). The Heisenberg uncertainty principle sets the limit on the precision of simultaneous measurements of position and momentum of a particle; it shows that the best precision we can obtain is when we have an equals sign ( = ) in Equation 6.63, and we cannot do better than that, even with the best instruments of the future. Heisenberg’s principle is a consequence of the wave nature of particles.
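To get a feeling for the size of the limit in Equation 6.63, the bound Δp ≥ 0.5ℏ/Δx can be evaluated for a chosen position uncertainty. The sketch below is illustrative only: the function name is hypothetical, and the 0.10-nm position uncertainty (roughly the size of an atom) and the electron mass are assumed inputs that do not come from the text above.

```python
HBAR_J_S = 1.054571817e-34      # reduced Planck constant in J*s
ELECTRON_MASS_KG = 9.109e-31    # electron mass in kg (assumed input for the illustration)


def min_momentum_uncertainty(delta_x_m):
    """Smallest momentum uncertainty allowed by delta_x * delta_p >= hbar / 2."""
    if delta_x_m <= 0:
        raise ValueError("position uncertainty must be positive")
    return 0.5 * HBAR_J_S / delta_x_m


# Assumed scenario: an electron localized to within 0.10 nm.
delta_x = 0.10e-9
delta_p = min_momentum_uncertainty(delta_x)
delta_v = delta_p / ELECTRON_MASS_KG  # nonrelativistic estimate of the speed uncertainty

print(f"minimum delta_p = {delta_p:.3e} kg*m/s")
print(f"corresponding delta_v = {delta_v:.3e} m/s")
```

Halving Δx doubles the minimum Δp, which is the trade-off the paragraph above describes. As Δx approaches zero, the bound grows without limit.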


Figure 6.25 In this graphic, a particle is shown as a wave packet and its position does not have an exact value.


We routinely use many electronic devices that exploit wave-particle duality without even realizing the sophistication of the physics underlying their operation. One example of a technology based on the particle properties of photons and electrons is a charge-coupled device, which is used for light detection in any instrumentation where high-quality digital data are required, such as in digital cameras or in medical sensors. An example in which the wave properties of electrons are exploited is an electron microscope.
In 1931, physicist Ernst Ruska, building on the idea that magnetic fields can direct an electron beam just as lenses can direct a beam of light in an optical microscope, developed the first prototype of the electron microscope. This development originated the field of electron microscopy. In the transmission electron microscope (TEM), shown in Figure 6.26, electrons are produced by a hot tungsten element and accelerated by a potential difference in an electron gun, which gives them up to 400 keV in kinetic energy. After leaving the electron gun, the electron beam is focused by electromagnetic lenses (a system of condensing lenses) and transmitted through a specimen sample to be viewed. The image of the sample is reconstructed from the transmitted electron beam. The magnified image may be viewed either directly on a fluorescent screen or indirectly by sending it, for example, to a digital camera or a computer monitor. The entire setup consisting of the electron gun, the lenses, the specimen, and the fluorescent screen is enclosed in a vacuum chamber to prevent the energy loss from the beam. Resolution of the TEM is limited only by spherical aberration (discussed in a previous chapter). Modern high-resolution models of a TEM can have resolving power better than 0.5 Å and magnifications higher than 50 million times. For comparison, the best resolving power obtained with light microscopy is currently about 97 nm. A limitation of the TEM is that the samples must be about 100-nm thick and biological samples require a special preparation involving chemical “fixing” to stabilize them for ultrathin slicing.
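At the upper end of the TEM energy range, the kinetic energy is no longer small compared with the electron rest energy, so the nonrelativistic wavelength formula used in Example 6.15 becomes inaccurate. The sketch below is an illustration rather than a calculation from the text. It assumes the relativistic energy relation E² = p²c² + E0² with E = K + E0, which gives λ = hc/√(K(K + 2E0)), and it reuses the rounded constants of Example 6.15. The function names are hypothetical.

```python
import math

HC_EV_M = 1.241e-6   # h*c in eV*m (rounded value used in Example 6.15)
E0_EV = 511e3        # electron rest energy in eV


def wavelength_relativistic(kinetic_energy_ev):
    """de Broglie wavelength in meters using pc = sqrt(K * (K + 2 * E0))."""
    pc_ev = math.sqrt(kinetic_energy_ev * (kinetic_energy_ev + 2.0 * E0_EV))
    return HC_EV_M / pc_ev


def wavelength_nonrelativistic(kinetic_energy_ev):
    """de Broglie wavelength in meters using pc = sqrt(2 * E0 * K)."""
    return HC_EV_M / math.sqrt(2.0 * E0_EV * kinetic_energy_ev)


# 400 keV is the maximum TEM kinetic energy quoted in the text.
k_tem = 400e3
lam_rel = wavelength_relativistic(k_tem)
lam_nonrel = wavelength_nonrelativistic(k_tem)

print(f"relativistic:    {lam_rel * 1e12:.2f} pm")
print(f"nonrelativistic: {lam_nonrel * 1e12:.2f} pm")
```

Both results are in the picometer range, far below the wavelength of visible light, which is why the electron wave can resolve much finer detail. The nonrelativistic formula overestimates the wavelength at this energy, while at 600 eV the two formulas agree closely.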






Figure 6.26 TEM: An electron beam produced by an electron gun is collimated by condenser lenses and passes through a specimen. The transmitted electrons are projected on a screen and the image is sent to a camera. (credit: modification of work by Dr. Graham Beards)


Such limitations do not appear in the scanning electron microscope (SEM), which was invented by Manfred von Ardenne in 1937. In an SEM, a typical energy of the electron beam is up to 40 keV and the beam is not transmitted through a sample but is scattered off its surface. Surface topography of the sample is reconstructed by analyzing back-scattered electrons, transmitted electrons, and the emitted radiation produced by electrons interacting with atoms in the sample. The resolving power of an SEM is better than 1 nm, and the magnification can be more than 250 times better than that obtained with a light microscope. The samples scanned by an SEM can be as large as several centimeters but they must be specially prepared, depending on electrical properties of the sample.
High magnifications of the TEM and SEM allow us to see individual molecules. High resolving powers of the TEM and SEM allow us to see fine details, such as those shown in the SEM micrograph of pollen at the beginning of this chapter (Figure 6.1).
Example 6.16


Resolving Power of an Electron Microscope
If a 1.0-pm electron beam of a TEM passes through a 2.0-µm circular opening, what is the angle between the
two just-resolvable point sources for this microscope?
Solution
We can directly use a formula for the resolving power, Δθ, of a microscope (discussed in a previous chapter) when the wavelength of the incident radiation is λ = 1.0 pm and the diameter of the aperture is D = 2.0 µm:


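\[
\Delta\theta = 1.22\,\frac{\lambda}{D} = 1.22\,\frac{1.0\ \mathrm{pm}}{2.0\ \mathrm{\mu m}} = 6.10 \times 10^{-7}\ \mathrm{rad} = 3.50 \times 10^{-5}\ \mathrm{degree}.
\]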


Significance
Note that if we used a conventional microscope with a 400-nm light, the resolving power would be only 14°,
which means that all of the fine details in the image would be blurred.
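The comparison made in Example 6.16 can be reproduced with a few lines of code. This is a minimal sketch with an illustrative function name. It evaluates Δθ = 1.22λ/D for the 1.0-pm electron beam and for the 400-nm light mentioned in the Significance, both with the 2.0-µm aperture.

```python
import math


def rayleigh_angle_rad(wavelength_m, aperture_diameter_m):
    """Smallest resolvable angle in radians for a circular aperture: 1.22 * lambda / D."""
    return 1.22 * wavelength_m / aperture_diameter_m


APERTURE_M = 2.0e-6  # aperture diameter from Example 6.16

electron_angle = rayleigh_angle_rad(1.0e-12, APERTURE_M)  # 1.0-pm electron beam
light_angle = rayleigh_angle_rad(400e-9, APERTURE_M)      # 400-nm light

print(f"electron beam: {electron_angle:.2e} rad = {math.degrees(electron_angle):.2e} degrees")
print(f"400-nm light:  {light_angle:.3f} rad = {math.degrees(light_angle):.1f} degrees")
```

Because Δθ is proportional to the wavelength, the ratio of the two angles equals the ratio of the two wavelengths, a factor of 400,000 for these values.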






6.15 Check Your Understanding Suppose that the diameter of the aperture in Example 6.16 is halved. How does it affect the resolving power?






CHAPTER 6 REVIEW
KEY TERMS

absorber: any object that absorbs radiation
absorption spectrum: wavelengths of absorbed radiation by atoms and molecules
Balmer formula: describes the emission spectrum of a hydrogen atom in the visible-light range
Balmer series: spectral lines corresponding to electron transitions to/from the n = 2 state of the hydrogen atom, described by the Balmer formula
blackbody: perfect absorber/emitter
blackbody radiation: radiation emitted by a blackbody
Bohr radius of hydrogen: radius of the first Bohr’s orbit
Bohr’s model of the hydrogen atom: first quantum model to explain emission spectra of hydrogen
Brackett series: spectral lines corresponding to electron transitions to/from the n = 4 state
Compton effect: the change in wavelength when an X-ray is scattered by its interaction with some materials
Compton shift: difference between the wavelengths of the incident X-ray and the scattered X-ray
Compton wavelength: physical constant with the value λc = 2.43 pm
cut-off frequency: frequency of incident light below which the photoelectric effect does not occur
cut-off wavelength: wavelength of incident light that corresponds to cut-off frequency
Davisson–Germer experiment: historically first electron-diffraction experiment that revealed electron waves
de Broglie wave: matter wave associated with any object that has mass and momentum
de Broglie’s hypothesis of matter waves: particles of matter can behave like waves
double-slit interference experiment: Young’s double-slit experiment, which shows the interference of waves
electron microscopy: microscopy that uses electron waves to “see” fine details of nano-size objects
emission spectrum: wavelengths of emitted radiation by atoms and molecules
emitter: any object that emits radiation
energy of a photon: quantum of radiant energy, depends only on a photon’s frequency
energy spectrum of hydrogen: set of allowed discrete energies of an electron in a hydrogen atom
excited energy states of the H atom: energy state other than the ground state
Fraunhofer lines: dark absorption lines in the continuum solar emission spectrum
ground state energy of the hydrogen atom: energy of an electron in the first Bohr orbit of the hydrogen atom
group velocity: velocity of a wave, energy travels with the group velocity
Heisenberg uncertainty principle: sets the limits on precision in simultaneous measurements of momentum and position of a particle
Humphreys series: spectral lines corresponding to electron transitions to/from the n = 6 state
hydrogen-like atom: ionized atom with one electron remaining and nucleus with charge +Ze
inelastic scattering: scattering effect where kinetic energy is not conserved but the total energy is conserved
ionization energy: energy needed to remove an electron from an atom
ionization limit of the hydrogen atom: ionization energy needed to remove an electron from the first Bohr orbit
Lyman series: spectral lines corresponding to electron transitions to/from the ground state
nuclear model of the atom: heavy positively charged nucleus at the center is surrounded by electrons, proposed by Rutherford
Paschen series: spectral lines corresponding to electron transitions to/from the n = 3 state
Pfund series: spectral lines corresponding to electron transitions to/from the n = 5 state
photocurrent: in a circuit, current that flows when a photoelectrode is illuminated
photoelectric effect: emission of electrons from a metal surface exposed to electromagnetic radiation of the proper frequency
photoelectrode: in a circuit, an electrode that emits photoelectrons
photoelectron: electron emitted from a metal surface in the presence of incident radiation
photon: particle of light
Planck’s hypothesis of energy quanta: energy exchanges between the radiation and the walls take place only in the form of discrete energy quanta
postulates of Bohr’s model: three assumptions that set a frame for Bohr’s model
power intensity: energy that passes through a unit surface per unit time
propagation vector: vector with magnitude 2π/λ that has the direction of the photon’s linear momentum
quantized energies: discrete energies; not continuous
quantum number: index that enumerates energy levels
quantum phenomenon: in interaction with matter, photon transfers either all its energy or nothing
quantum state of a Planck’s oscillator: any mode of vibration of Planck’s oscillator, enumerated by quantum number
reduced Planck’s constant: Planck’s constant divided by 2π
Rutherford’s gold foil experiment: first experiment to demonstrate the existence of the atomic nucleus
Rydberg constant for hydrogen: physical constant in the Balmer formula
Rydberg formula: experimentally found positions of spectral lines of hydrogen atom
scattering angle: angle between the direction of the scattered beam and the direction of the incident beam
Stefan–Boltzmann constant: physical constant in Stefan’s law
stopping potential: in a circuit, potential difference that stops photocurrent
wave number: magnitude of the propagation vector
wave quantum mechanics: theory that explains the physics of atoms and subatomic particles
wave-particle duality: particles can behave as waves and radiation can behave as particles
work function: energy needed to detach photoelectron from the metal surface
α-particle: doubly ionized helium atom
α-ray: beam of α-particles (alpha-particles)
β-ray: beam of electrons
γ-ray: beam of highly energetic photons


KEY EQUATIONS
Wien’s displacement law: $\lambda_{\max} T = 2.898 \times 10^{-3}\ \mathrm{m \cdot K}$
Stefan’s law: $P(T) = \sigma A T^4$
Planck’s constant: $h = 6.626 \times 10^{-34}\ \mathrm{J \cdot s} = 4.136 \times 10^{-15}\ \mathrm{eV \cdot s}$
Energy quantum of radiation: $\Delta E = hf$
Planck’s blackbody radiation law: $I(\lambda, T) = \dfrac{2\pi h c^2}{\lambda^5}\,\dfrac{1}{e^{hc/\lambda k_B T} - 1}$
Maximum kinetic energy of a photoelectron: $K_{\max} = e\,\Delta V_s$
Energy of a photon: $E_f = hf$
Energy balance for photoelectron: $K_{\max} = hf - \phi$
Cut-off frequency: $f_c = \dfrac{\phi}{h}$
Relativistic invariant energy equation: $E^2 = p^2 c^2 + m_0^2 c^4$
Energy-momentum relation for photon: $p_f = \dfrac{E_f}{c}$
Energy of a photon: $E_f = hf = \dfrac{hc}{\lambda}$
Magnitude of photon’s momentum: $p_f = \dfrac{h}{\lambda}$
Photon’s linear momentum vector: $\vec{p}_f = \hbar \vec{k}$
The Compton wavelength of an electron: $\lambda_c = \dfrac{h}{m_0 c} = 0.00243\ \mathrm{nm}$
The Compton shift: $\Delta\lambda = \lambda_c (1 - \cos\theta)$
The Balmer formula: $\dfrac{1}{\lambda} = R_H \left( \dfrac{1}{2^2} - \dfrac{1}{n^2} \right)$
The Rydberg formula: $\dfrac{1}{\lambda} = R_H \left( \dfrac{1}{n_f^2} - \dfrac{1}{n_i^2} \right),\ n_i = n_f + 1,\ n_f + 2,\ \ldots$
Bohr’s first quantization condition: $L_n = n\hbar,\ n = 1, 2, \ldots$
Bohr’s second quantization condition: $hf = |E_n - E_m|$
Bohr’s radius of hydrogen: $a_0 = 4\pi\varepsilon_0 \dfrac{\hbar^2}{m_e e^2} = 0.529\ \text{Å}$
Bohr’s radius of the nth orbit: $r_n = a_0 n^2$
Ground-state energy value, ionization limit: $E_0 = \dfrac{1}{8\varepsilon_0^2}\,\dfrac{m_e e^4}{h^2} = 13.6\ \mathrm{eV}$
Electron’s energy in the nth orbit: $E_n = -E_0 \dfrac{1}{n^2}$
Ground state energy of hydrogen: $E_1 = -E_0 = -13.6\ \mathrm{eV}$
The nth orbit of hydrogen-like ion: $r_n = \dfrac{a_0}{Z} n^2$
The nth energy of hydrogen-like ion: $E_n = -Z^2 E_0 \dfrac{1}{n^2}$
Energy of a matter wave: $E = hf$
The de Broglie wavelength: $\lambda = \dfrac{h}{p}$
The frequency-wavelength relation for matter waves: $\lambda f = \dfrac{c}{\beta}$
Heisenberg’s uncertainty principle: $\Delta x\, \Delta p \geq \dfrac{1}{2}\hbar$

SUMMARY
6.1 Blackbody Radiation


• All bodies radiate energy. The amount of radiation a body emits depends on its temperature. The experimental Wien’s displacement law states that the hotter the body, the shorter the wavelength corresponding to the emission peak in the radiation curve. The experimental Stefan’s law states that the total power of radiation emitted across the entire spectrum of wavelengths at a given temperature is proportional to the fourth power of the Kelvin temperature of the radiating body.
• Absorption and emission of radiation are studied within the model of a blackbody. In the classical approach, the exchange of energy between radiation and cavity walls is continuous. The classical approach does not explain the blackbody radiation curve.
• To explain the blackbody radiation curve, Planck assumed that the exchange of energy between radiation and cavity walls takes place only in discrete quanta of energy. Planck’s hypothesis of energy quanta led to the theoretical Planck’s radiation law, which agrees with the experimental blackbody radiation curve; it also explains Wien’s and Stefan’s laws.


6.2 Photoelectric Effect
• The photoelectric effect occurs when photoelectrons are ejected from a metal surface in response to monochromatic radiation incident on the surface. It has three characteristics: (1) it is instantaneous, (2) it occurs only when the radiation is above a cut-off frequency, and (3) kinetic energies of photoelectrons at the surface do not depend on the intensity of radiation. The photoelectric effect cannot be explained by classical theory.
• We can explain the photoelectric effect by assuming that radiation consists of photons (particles of light). Each photon carries a quantum of energy. The energy of a photon depends only on its frequency, which is the frequency of the radiation. At the surface, the entire energy of a photon is transferred to one photoelectron.
• The maximum kinetic energy of a photoelectron at the metal surface is the difference between the energy of the incident photon and the work function of the metal. The work function is the binding energy of electrons to the metal surface. Each metal has its own characteristic work function.


6.3 The Compton Effect
• In the Compton effect, X-rays scattered off some materials have different wavelengths than the wavelength of the incident X-rays. This phenomenon does not have a classical explanation.
• The Compton effect is explained by assuming that radiation consists of photons that collide with weakly bound electrons in the target material. Both electron and photon are treated as relativistic particles. Conservation laws of the total energy and of momentum are obeyed in collisions.
• Treating the photon as a particle with momentum that can be transferred to an electron leads to a theoretical Compton shift that agrees with the wavelength shift measured in the experiment. This provides evidence that radiation consists of photons.
• Compton scattering is an inelastic scattering, in which scattered radiation has a longer wavelength than that of incident radiation.
6.4 Bohr’s Model of the Hydrogen Atom


• Positions of absorption and emission lines in the spectrum of atomic hydrogen are given by the experimental Rydberg formula. Classical physics cannot explain the spectrum of atomic hydrogen.
• The Bohr model of hydrogen was the first model of atomic structure to correctly explain the radiation spectra of atomic hydrogen. It was preceded by the Rutherford nuclear model of the atom. In Rutherford’s model, an atom consists of a positively charged point-like nucleus that contains almost the entire mass of the atom and of negative electrons that are located far away from the nucleus.
• Bohr’s model of the hydrogen atom is based on three postulates: (1) an electron moves around the nucleus in a circular orbit, (2) an electron’s angular momentum in the orbit is quantized, and (3) the change in an electron’s energy as it makes a quantum jump from one orbit to another is always accompanied by the emission or absorption of a photon. Bohr’s model is semi-classical because it combines the classical concept of electron orbit (postulate 1) with the new concept of quantization (postulates 2 and 3).
• Bohr’s model of the hydrogen atom explains the emission and absorption spectra of atomic hydrogen and hydrogen-like ions with low atomic numbers. It was the first model to introduce the concept of a quantum number to describe atomic states and to postulate quantization of electron orbits in the atom. Bohr’s model is an important step in the development of quantum mechanics, which deals with many-electron atoms.


6.5 De Broglie’s Matter Waves
• De Broglie’s hypothesis of matter waves postulates that any particle of matter that has linear momentum is also a wave. The wavelength of a matter wave associated with a particle is inversely proportional to the magnitude of the particle’s linear momentum. The speed of the matter wave is the speed of the particle.
• De Broglie’s concept of the electron matter wave provides a rationale for the quantization of the electron’s angular momentum in Bohr’s model of the hydrogen atom.
• In the Davisson–Germer experiment, electrons are scattered off a crystalline nickel surface. Diffraction patterns of electron matter waves are observed. They are the evidence for the existence of matter waves. Matter waves are observed in diffraction experiments with various particles.


6.6 Wave-Particle Duality
• Wave-particle duality exists in nature: Under some experimental conditions, a particle acts as a particle; under other experimental conditions, a particle acts as a wave. Conversely, under some physical circumstances, electromagnetic radiation acts as a wave, and under other physical circumstances, radiation acts as a beam of photons.
• Modern-era double-slit experiments with electrons demonstrated conclusively that electron-diffraction images are formed because of the wave nature of electrons.
• The wave-particle dual nature of particles and of radiation has no classical explanation.
• Quantum theory takes the wave property to be the fundamental property of all particles. A particle is seen as a moving wave packet. The wave nature of particles imposes a limitation on the simultaneous measurement of the particle’s position and momentum. Heisenberg’s uncertainty principle sets the limits on precision in such simultaneous measurements.
• Wave-particle duality is exploited in many devices, such as charge-coupled devices (used in digital cameras) or in the electron microscopy of the scanning electron microscope (SEM) and the transmission electron microscope (TEM).


CONCEPTUAL QUESTIONS
6.1 Blackbody Radiation
1. Which surface has a higher temperature: the surface of a yellow star or that of a red star?
2. Describe what you would see when looking at a body whose temperature is increased from 1000 K to 1,000,000 K.
3. Explain the color changes in a hot body as its temperature is increased.






4. Speculate as to why UV light causes sunburn, whereas visible light does not.
5. Two cavity radiators are constructed with walls made of different metals. At the same temperature, how would their radiation spectra differ?
6. Discuss why some bodies appear black, other bodies appear red, and still other bodies appear white.
7. If everything radiates electromagnetic energy, why can we not see objects at room temperature in a dark room?
8. How much does the power radiated by a blackbody increase when its temperature (in K) is tripled?


6.2 Photoelectric Effect
9. For the same monochromatic light source, would the photoelectric effect occur for all metals?
10. In the interpretation of the photoelectric effect, how is it known that an electron does not absorb more than one photon?
11. Explain how you can determine the work function from a plot of the stopping potential versus the frequency of the incident radiation in a photoelectric effect experiment. Can you determine the value of Planck’s constant from this plot?
12. Suppose that in the photoelectric-effect experiment we make a plot of the detected current versus the applied potential difference. What information do we obtain from such a plot? Can we determine from it the value of Planck’s constant? Can we determine the work function of the metal?
13. Speculate how increasing the temperature of a photoelectrode affects the outcomes of the photoelectric effect experiment.
14. Which aspects of the photoelectric effect cannot be explained by classical physics?
15. Is the photoelectric effect a consequence of the wave character of radiation or is it a consequence of the particle character of radiation? Explain briefly.
16. The metals sodium, iron, and molybdenum have work functions 2.5 eV, 3.9 eV, and 4.2 eV, respectively. Which of these metals will emit photoelectrons when illuminated with 400 nm light?


6.3 The Compton Effect
17. Discuss any similarities and differences between the photoelectric and the Compton effects.
18. Which has a greater momentum: a UV photon or an IR photon?
19. Does changing the intensity of a monochromatic light beam affect the momentum of the individual photons in the beam? Does such a change affect the net momentum of the beam?
20. Can the Compton effect occur with visible light? If so, will it be detectable?
21. Is it possible in the Compton experiment to observe scattered X-rays that have a shorter wavelength than the incident X-ray radiation?
22. Show that the Compton wavelength has the dimension of length.
23. At what scattering angle is the wavelength shift in the Compton effect equal to the Compton wavelength?


6.4 Bohr’s Model of the Hydrogen Atom
24. Explain why the patterns of bright emission spectrallines have an identical spectral position to the pattern ofdark absorption spectral lines for a given gaseous element.
25. Do the various spectral lines of the hydrogen atomoverlap?
26. The Balmer series for hydrogen was discovered beforeeither the Lyman or the Paschen series. Why?
27. When the absorption spectrum of hydrogen at roomtemperature is analyzed, absorption lines for the Lymanseries are found, but none are found for the Balmer series.What does this tell us about the energy state of mosthydrogen atoms at room temperature?
28. Hydrogen accounts for about 75% by mass of thematter at the surfaces of most stars. However, theabsorption lines of hydrogen are strongest (of highestintensity) in the spectra of stars with a surface temperatureof about 9000 K. They are weaker in the sun spectrumand are essentially nonexistent in very hot (temperaturesabove 25,000 K) or rather cool (temperatures below 3500K) stars. Speculate as to why surface temperature affectsthe hydrogen absorption lines that we observe.
29. Discuss the similarities and differences between


294 Chapter 6 | Photons and Matter Waves


This OpenStax book is available for free at http://cnx.org/content/col12067/1.4




Thomson’s model of the hydrogen atom and Bohr’s modelof the hydrogen atom.
30. Discuss the way in which Thomson’s model is nonphysical. Support your argument with experimental evidence.
31. If, in a hydrogen atom, an electron moves to an orbit with a larger radius, does the energy of the hydrogen atom increase or decrease?
32. How is the energy conserved when an atom makes a transition from a higher to a lower energy state?
33. Suppose an electron in a hydrogen atom makes a transition from the (n+1)th orbit to the nth orbit. Is the wavelength of the emitted photon longer for larger values of n, or for smaller values of n?
34. Discuss why the allowed energies of the hydrogen atom are negative.
35. Can a hydrogen atom absorb a photon whose energy is greater than 13.6 eV?
36. Why can you see through glass but not through wood?
37. Do gravitational forces have a significant effect on atomic energy levels?
38. Show that Planck’s constant has the dimensions of angular momentum.


6.5 De Broglie’s Matter Waves
39. Which type of radiation is most suitable for the observation of diffraction patterns on crystalline solids: radio waves, visible light, or X-rays? Explain.
40. Speculate as to how the diffraction patterns of a typical crystal would be affected if γ-rays were used instead of X-rays.
41. If an electron and a proton are traveling at the same speed, which one has the shorter de Broglie wavelength?
42. If a particle is accelerating, how does this affect its de Broglie wavelength?
43. Why is the wave-like nature of matter not observed every day for macroscopic objects?
44. What is the wavelength of a neutron at rest? Explain.
45. Why does the setup of the Davisson–Germer experiment need to be enclosed in a vacuum chamber? Discuss what result you expect when the chamber is not evacuated.


6.6 Wave-Particle Duality
46. Give an example of an experiment in which light behaves as waves. Give an example of an experiment in which light behaves as a stream of photons.
47. Discuss: How does the interference of water waves differ from the interference of electrons? How are they analogous?
48. Give at least one argument in support of the matter-wave hypothesis.
49. Give at least one argument in support of the particle nature of radiation.
50. Explain the importance of the Young double-slit experiment.
51. Does the Heisenberg uncertainty principle allow a particle to be at rest in a designated region in space?
52. Can the de Broglie wavelength of a particle be known exactly?
53. Do the photons of red light produce better resolution in a microscope than blue light photons? Explain.
54. Discuss the main difference between an SEM and a TEM.


PROBLEMS
6.1 Blackbody Radiation
55. A 200-W heater emits a 1.5-µm radiation. (a) What value of the energy quantum does it emit? (b) Assuming that the specific heat of a 4.0-kg body is 0.83 kcal/kg·K, how many of these photons must be absorbed by the body to increase its temperature by 2 K? (c) How long does the heating process in (b) take, assuming that all radiation emitted by the heater gets absorbed by the body?
56. A 900-W microwave generator in an oven generates energy quanta of frequency 2560 MHz. (a) How many energy quanta does it emit per second? (b) How many energy quanta must be absorbed by a pasta dish placed in the radiation cavity to increase its temperature by 45.0 K? Assume that the dish has a mass of 0.5 kg and that its specific heat is 0.9 kcal/kg·K. (c) Assume that all energy quanta emitted by the generator are absorbed by the pasta dish. How long must we wait until the dish in (b) is ready?
57. (a) For what temperature is the peak of the blackbody radiation spectrum at 400 nm? (b) If the temperature of a blackbody is 800 K, at what wavelength does it radiate the most energy?
58. The tungsten elements of incandescent light bulbs operate at 3200 K. At what frequency does the filament radiate maximum energy?
59. Interstellar space is filled with radiation of wavelength 970 µm. This radiation is considered to be a remnant of the “big bang.” What is the corresponding blackbody temperature of this radiation?
60. The radiant energy from the sun reaches its maximum at a wavelength of about 500.0 nm. What is the approximate temperature of the sun’s surface?


6.2 Photoelectric Effect
61. A photon has energy 20 keV. What are its frequency and wavelength?
62. The wavelengths of visible light range from approximately 400 to 750 nm. What is the corresponding range of photon energies for visible light?
63. What is the longest wavelength of radiation that can eject a photoelectron from silver? Is it in the visible range?
64. What is the longest wavelength of radiation that can eject a photoelectron from potassium, given the work function of potassium 2.24 eV? Is it in the visible range?
65. Estimate the binding energy of electrons in magnesium, given that the wavelength of 337 nm is the longest wavelength that a photon may have to eject a photoelectron from a magnesium photoelectrode.
66. The work function for potassium is 2.26 eV. What is the cutoff frequency when this metal is used as a photoelectrode? What is the stopping potential for the emitted electrons when this photoelectrode is exposed to radiation of frequency 1200 THz?
67. Estimate the work function of aluminum, given that the wavelength of 304 nm is the longest wavelength that a photon may have to eject a photoelectron from an aluminum photoelectrode.
68. What is the maximum kinetic energy of photoelectrons ejected from sodium by incident radiation of wavelength 450 nm?
69. A 120-nm UV radiation illuminates a gold-plated electrode. What is the maximum kinetic energy of the ejected photoelectrons?
70. A 400-nm violet light ejects photoelectrons with a maximum kinetic energy of 0.860 eV from a sodium photoelectrode. What is the work function of sodium?
71. A 600-nm light falls on a photoelectric surface and electrons with the maximum kinetic energy of 0.17 eV are emitted. Determine (a) the work function and (b) the cutoff frequency of the surface. (c) What is the stopping potential when the surface is illuminated with light of wavelength 400 nm?
72. The cutoff wavelength for the emission of photoelectrons from a particular surface is 500 nm. Find the maximum kinetic energy of the ejected photoelectrons when the surface is illuminated with light of wavelength 600 nm.
73. Find the wavelength of radiation that can eject 2.00-eV electrons from a calcium electrode. The work function for calcium is 2.71 eV. In what range is this radiation?
74. Find the wavelength of radiation that can eject 0.10-eV electrons from a potassium electrode. The work function for potassium is 2.24 eV. In what range is this radiation?
75. Find the maximum velocity of photoelectrons ejected by an 80-nm radiation, if the work function of the photoelectrode is 4.73 eV.


6.3 The Compton Effect
76. What is the momentum of a 589-nm yellow photon?
77. What is the momentum of a 4-cm microwave photon?
78. In a beam of white light (wavelengths from 400 to 750 nm), what range of momentum can the photons have?
79. What is the energy of a photon whose momentum is $3.0 \times 10^{-24}$ kg·m/s?
80. What is the wavelength of (a) a 12-keV X-ray photon; (b) a 2.0-MeV γ-ray photon?
81. Find the momentum and energy of a 1.0-Å photon.
82. Find the wavelength and energy of a photon with momentum $5.00 \times 10^{-29}$ kg·m/s.
83. A γ-ray photon has a momentum of $8.00 \times 10^{-21}$ kg·m/s. Find its wavelength and energy.
84. (a) Calculate the momentum of a 2.5-µm photon. (b) Find the velocity of an electron with the same momentum. (c) What is the kinetic energy of the electron, and how does it compare to that of the photon?
85. Show that $p = h/\lambda$ and $E_f = hf$ are consistent with the relativistic formula $E^2 = p^2c^2 + m_0^2c^4$.
86. Show that the energy E in eV of a photon is given by $E = (1.241 \times 10^{-6}\ \text{eV} \cdot \text{m})/\lambda$, where λ is its wavelength in meters.
87. For collisions with free electrons, compare the Compton shift of a photon scattered at an angle of 30° to that of a photon scattered at 45°.
88. X-rays of wavelength 12.5 pm are scattered from a block of carbon. What are the wavelengths of photons scattered at (a) 30°; (b) 90°; and (c) 180°?


6.4 Bohr’s Model of the Hydrogen Atom
89. Calculate the wavelength of the first line in the Lyman series and show that this line lies in the ultraviolet part of the spectrum.
90. Calculate the wavelength of the fifth line in the Lyman series and show that this line lies in the ultraviolet part of the spectrum.
91. Calculate the energy changes corresponding to the transitions of the hydrogen atom: (a) from n = 3 to n = 4; (b) from n = 2 to n = 1; and (c) from n = 3 to n = ∞.
92. Determine the wavelength of the third Balmer line (transition from n = 5 to n = 2).
93. What is the frequency of the photon absorbed when the hydrogen atom makes the transition from the ground state to the n = 4 state?
94. When a hydrogen atom is in its ground state, what are the shortest and longest wavelengths of the photons it can absorb without being ionized?
95. When a hydrogen atom is in its third excited state, what are the shortest and longest wavelengths of the photons it can emit?
96. What is the longest wavelength that light can have if it is to be capable of ionizing the hydrogen atom in its ground state?
97. For an electron in a hydrogen atom in the n = 2 state, compute: (a) the angular momentum; (b) the kinetic energy; (c) the potential energy; and (d) the total energy.
98. Find the ionization energy of a hydrogen atom in the fourth energy state.
99. It has been measured that it required 0.850 eV to remove an electron from the hydrogen atom. In what state was the atom before the ionization happened?
100. What is the radius of a hydrogen atom when the electron is in the first excited state?
101. Find the shortest wavelength in the Balmer series. In what part of the spectrum does this line lie?
102. Show that the entire Paschen series lies in the infrared part of the spectrum.
103. Do the Balmer series and the Lyman series overlap? Why? Why not? (Hint: calculate the shortest Balmer line and the longest Lyman line.)
104. (a) Which line in the Balmer series is the first one in the UV part of the spectrum? (b) How many Balmer lines lie in the visible part of the spectrum? (c) How many Balmer lines lie in the UV?
105. A 4.653-µm emission line of atomic hydrogen corresponds to a transition between the states $n_f = 5$ and $n_i$. Find $n_i$.


6.5 De Broglie’s Matter Waves
106. At what velocity will an electron have a wavelength of 1.00 m?
107. What is the de Broglie wavelength of an electron travelling at a speed of $5.0 \times 10^{6}$ m/s?
108. What is the de Broglie wavelength of an electron that is accelerated from rest through a potential difference of 20 kV?
109. What is the de Broglie wavelength of a proton whose kinetic energy is 2.0 MeV? 10.0 MeV?
110. What is the de Broglie wavelength of a 10-kg football player running at a speed of 8.0 m/s?
111. (a) What is the energy of an electron whose de Broglie wavelength is that of a photon of yellow light with wavelength 590 nm? (b) What is the de Broglie wavelength of an electron whose energy is that of the photon of yellow light?
112. The de Broglie wavelength of a neutron is 0.01 nm. What are the speed and energy of this neutron?
113. What is the wavelength of an electron that is moving at 3% of the speed of light?
114. At what velocity does a proton have a 6.0-fm wavelength (about the size of a nucleus)? Give your answer in units of c.
115. What is the velocity of a 0.400-kg billiard ball if its wavelength is 7.50 fm?
116. Find the wavelength of a proton that is moving at 1.00% of the speed of light (when β = 0.01).


6.6 Wave-Particle Duality
117. An AM radio transmitter radiates 500 kW at a frequency of 760 kHz. How many photons per second does the emitter emit?
118. Find the Lorentz factor γ and de Broglie’s wavelength for a 50-GeV electron in a particle accelerator.
119. Find the Lorentz factor γ and de Broglie’s wavelength for a 1.0-TeV proton in a particle accelerator.
120. What is the kinetic energy of a 0.01-nm electron in a TEM?
121. If an electron is to be diffracted significantly by a crystal, its wavelength must be about equal to the spacing, d, of crystalline planes. Assuming d = 0.250 nm, estimate the potential difference through which an electron must be accelerated from rest if it is to be diffracted by these planes.
122. X-rays form ionizing radiation that is dangerous to living tissue and undetectable to the human eye. Suppose that a student researcher working in an X-ray diffraction laboratory is accidentally exposed to a fatal dose of radiation. Calculate the temperature increase of the researcher under the following conditions: the energy of X-ray photons is 200 keV and the researcher absorbs $4 \times 10^{13}$ photons per kilogram of body weight during the exposure. Assume that the specific heat of the student’s body is 0.83 kcal/kg·K.
123. Solar wind (radiation) that is incident on the top of Earth’s atmosphere has an average intensity of $1.3\ \text{kW/m}^2$. Suppose that you are building a solar sail that is to propel a small toy spaceship with a mass of 0.1 kg in the space between the International Space Station and the moon. The sail is made from a very light material, which perfectly reflects the incident radiation. To assess whether such a project is feasible, answer the following questions, assuming that radiation photons are incident only in the normal direction to the sail’s reflecting surface. (a) What is the radiation pressure (force per $\text{m}^2$) of the radiation falling on the mirror-like sail? (b) Given the radiation pressure computed in (a), what will be the acceleration of the spaceship when the sail has an area of $10.0\ \text{m}^2$? (c) Given the acceleration estimate in (b), how fast will the spaceship be moving after 24 hours when it starts from rest?
124. Treat the human body as a blackbody and determine the percentage increase in the total power of its radiation when its temperature increases from 98.6 °F to 103 °F.
125. Show that Wien’s displacement law results from Planck’s radiation law. (Hint: substitute $x = hc/(\lambda kT)$ and write Planck’s law in the form $I(x, T) = Ax^5/(e^x - 1)$, where $A = 2\pi(kT)^5/(h^4c^3)$. Now, for fixed T, find the position of the maximum in I(x, T) by solving for x in the equation $dI(x, T)/dx = 0$.)
126. Show that Stefan’s law results from Planck’s radiation law. Hint: To compute the total power of blackbody radiation emitted across the entire spectrum of wavelengths at a given temperature, integrate Planck’s law over the entire spectrum, $P(T) = \int_0^{\infty} I(\lambda, T)\,d\lambda$. Use the substitution $x = hc/(\lambda kT)$ and the tabulated value of the integral $\int_0^{\infty} dx\,x^3/(e^x - 1) = \pi^4/15$.






ADDITIONAL PROBLEMS
127. Determine the power intensity of radiation per unit wavelength emitted at a wavelength of 500.0 nm by a blackbody at a temperature of 10,000 K.
128. The HCl molecule oscillates at a frequency of 87.0 THz. What is the difference (in eV) between its adjacent energy levels?
129. A quantum mechanical oscillator vibrates at a frequency of 250.0 THz. What is the minimum energy of radiation it can emit?
130. In about 5 billion years, the sun will evolve to a red giant. Assume that its surface temperature will decrease to about half its present value of 6000 K, while its present radius of $7.0 \times 10^{8}$ m will increase to $1.5 \times 10^{11}$ m (which is the current Earth-sun distance). Calculate the ratio of the total power emitted by the sun in its red giant stage to its present power.
131. A sodium lamp emits 2.0 W of radiant energy, most of which has a wavelength of about 589 nm. Estimate the number of photons emitted per second by the lamp.
132. Photoelectrons are ejected from a photoelectrode and are detected at a distance of 2.50 cm away from the photoelectrode. The work function of the photoelectrode is 2.71 eV and the incident radiation has a wavelength of 420 nm. How long does it take a photoelectron to travel to the detector?
133. If the work function of a metal is 3.2 eV, what is the maximum wavelength that a photon can have to eject a photoelectron from this metal surface?
134. The work function of a photoelectric surface is 2.00 eV. What is the maximum speed of the photoelectrons emitted from this surface when a 450-nm light falls on it?
135. A 400-nm laser beam is projected onto a calcium electrode. The power of the laser beam is 2.00 mW and the work function of calcium is 2.31 eV. (a) How many photoelectrons per second are ejected? (b) What net power is carried away by photoelectrons?
136. (a) Calculate the number of photoelectrons per second that are ejected from a $1.00\text{-mm}^2$ area of sodium metal by a 500-nm radiation with intensity $1.30\ \text{kW/m}^2$ (the intensity of sunlight above Earth’s atmosphere). (b) Given the work function of the metal as 2.28 eV, what power is carried away by these photoelectrons?
137. A laser with a power output of 2.00 mW at a 400-nm wavelength is used to project a beam of light onto a calcium photoelectrode. (a) How many photoelectrons leave the calcium surface per second? (b) What power is carried away by ejected photoelectrons, given that the work function of calcium is 2.31 eV? (c) Calculate the photocurrent. (d) If the photoelectrode suddenly becomes electrically insulated and the setup of two electrodes in the circuit suddenly starts to act like a 2.00-pF capacitor, how long will current flow before the capacitor voltage stops it?
138. The work function for barium is 2.48 eV. Find the maximum kinetic energy of the ejected photoelectrons when the barium surface is illuminated with: (a) radiation emitted by a 100-kW radio station broadcasting at 800 kHz; (b) a 633-nm laser light emitted from a powerful He-Ne laser; and (c) a 434-nm blue light emitted by a small hydrogen gas discharge tube.
139. (a) Calculate the wavelength of a photon that has the same momentum as a proton moving with 1% of the speed of light in a vacuum. (b) What is the energy of this photon in MeV? (c) What is the kinetic energy of the proton in MeV?
140. (a) Find the momentum of a 100-keV X-ray photon. (b) Find the velocity of a neutron with the same momentum. (c) What is the neutron’s kinetic energy in eV?
141. The momentum of light, as it is for particles, is exactly reversed when a photon is reflected straight back from a mirror, assuming negligible recoil of the mirror. The change in momentum is twice the photon’s incident momentum, as it is for the particles. Suppose that a beam of light has an intensity of $1.0\ \text{kW/m}^2$ and falls on a $2.0\text{-m}^2$ area of a mirror and reflects from it. (a) Calculate the energy reflected in 1.00 s. (b) What is the momentum imparted to the mirror? (c) Use Newton’s second law to find the force on the mirror. (d) Does the assumption of no recoil for the mirror seem reasonable?
142. A photon of energy 5.0 keV collides with a stationary electron and is scattered at an angle of 60°. What is the energy acquired by the electron in the collision?
143. A 0.75-nm photon is scattered by a stationary electron. The speed of the electron’s recoil is $1.5 \times 10^{6}$ m/s. (a) Find the wavelength shift of the photon. (b) Find the scattering angle of the photon.
144. Find the maximum change in X-ray wavelength that can occur due to Compton scattering. Does this change depend on the wavelength of the incident beam?






145. A photon of wavelength 700 nm is incident on a hydrogen atom. When this photon is absorbed, the atom becomes ionized. What is the lowest possible orbit that the electron could have occupied before being ionized?
146. What is the maximum kinetic energy of an electron such that a collision between the electron and a stationary hydrogen atom in its ground state is definitely elastic?
147. Singly ionized atomic helium $\text{He}^{+}$ is a hydrogen-like ion. (a) What is its ground-state radius? (b) Calculate the energies of its four lowest energy states. (c) Repeat the calculations for the $\text{Li}^{2+}$ ion.
148. A triply ionized atom of beryllium $\text{Be}^{3+}$ is a hydrogen-like ion. When $\text{Be}^{3+}$ is in one of its excited states, its radius in this nth state is exactly the same as the radius of the first Bohr orbit of hydrogen. Find n and compute the ionization energy for this state of $\text{Be}^{3+}$.
149. In extreme-temperature environments, such as those existing in a solar corona, atoms may be ionized by undergoing collisions with other atoms. One example of such ionization in the solar corona is the presence of $\text{C}^{5+}$ ions, detected in the Fraunhofer spectrum. (a) By what factor are the energies of the $\text{C}^{5+}$ ion scaled compared with the energy spectrum of a hydrogen atom? (b) What is the wavelength of the first line in the Paschen series of $\text{C}^{5+}$? (c) In what part of the spectrum are these lines located?
150. (a) Calculate the ionization energy for $\text{He}^{+}$. (b) What is the minimum frequency of a photon capable of ionizing $\text{He}^{+}$?
151. Experiments are performed with ultracold neutrons having velocities as small as 1.00 m/s. Find the wavelength of such an ultracold neutron and its kinetic energy.
152. Find the velocity and kinetic energy of a 6.0-fm neutron. (Rest mass energy of a neutron is $E_0 = 940$ MeV.)
153. The spacing between crystalline planes in the NaCl crystal is 0.281 nm, as determined by X-ray diffraction with X-rays of wavelength 0.170 nm. What is the energy of neutrons in the neutron beam that produces diffraction peaks at the same locations as the peaks obtained with the X-rays?
154. What is the wavelength of an electron accelerated from rest in a 30.0-kV potential difference?
155. Calculate the velocity of a 1.0-µm electron and a potential difference used to accelerate it from rest to this velocity.
156. In a supercollider at CERN, protons are accelerated to velocities of 0.25c. What are their wavelengths at this speed? What are their kinetic energies? If a beam of protons were to gain its kinetic energy in only one pass through a potential difference, how high would this potential difference have to be? (Rest mass energy of a proton is $E_0 = 938$ MeV.)
157. Find the de Broglie wavelength of an electron accelerated from rest in an X-ray tube through a potential difference of 100 kV. (Rest mass energy of an electron is $E_0 = 511$ keV.)
158. The cutoff wavelength for the emission of photoelectrons from a particular surface is 500 nm. Find the maximum kinetic energy of the ejected photoelectrons when the surface is illuminated with light of wavelength 450 nm.
159. Compare the wavelength shift of a photon scattered by a free electron to that of a photon scattered at the same angle by a free proton.
160. The spectrometer used to measure the wavelengths of the scattered X-rays in the Compton experiment is accurate to $5.0 \times 10^{-4}$ nm. What is the minimum scattering angle for which the X-rays interacting with the free electrons can be distinguished from those interacting with the atoms?
161. Consider a hydrogen-like ion where an electron is orbiting a nucleus that has charge q = +Ze. Derive the formulas for the energy $E_n$ of the electron in the nth orbit and the orbital radius $r_n$.
162. Assume that a hydrogen atom exists in the n = 2 excited state for $10^{-8}$ s before decaying to the ground state. How many times does the electron orbit the proton nucleus during this time? How long does it take Earth to orbit the sun this many times?
163. An atom can be formed when a negative muon is captured by a proton. The muon has the same charge as the electron and a mass 207 times that of the electron. Calculate the frequency of the photon emitted when this atom makes the transition from n = 2 to the n = 1 state. Assume that the muon is orbiting a stationary proton.






7 | QUANTUM MECHANICS


Figure 7.1 A D-Wave qubit processor: The brain of a quantum computer that encodes information in quantum bits to perform complex calculations. (credit: modification of work by D-Wave Systems, Inc.)


Chapter Outline
7.1 Wave Functions
7.2 The Heisenberg Uncertainty Principle
7.3 The Schrödinger Equation
7.4 The Quantum Particle in a Box
7.5 The Quantum Harmonic Oscillator
7.6 The Quantum Tunneling of Particles through Potential Barriers


Introduction
Quantum mechanics is a powerful framework for understanding the motions and interactions of particles at small scales, such as atoms and molecules. The ideas behind quantum mechanics often appear quite strange. In many ways, our everyday experience with the macroscopic physical world does not prepare us for the microscopic world of quantum mechanics. The purpose of this chapter is to introduce you to this exciting world.
Pictured above is a quantum-computer processor. This device is the “brain” of a quantum computer that operates at near-absolute zero temperatures. Unlike a digital computer, which encodes information in binary digits (definite states of either zero or one), a quantum computer encodes information in quantum bits or qubits (mixed states of zero and one). Quantum computers are discussed in the first section of this chapter.






7.1 | Wave Functions
Learning Objectives


By the end of this section, you will be able to:
• Describe the statistical interpretation of the wave function
• Use the wave function to determine probabilities
• Calculate expectation values of position, momentum, and kinetic energy


In the preceding chapter, we saw that particles act in some cases like particles and in other cases like waves. But what does it mean for a particle to “act like a wave”? What precisely is “waving”? What rules govern how this wave changes and propagates? How is the wave function used to make predictions? For example, if the amplitude of an electron wave is given by a function of position and time, Ψ(x, t), defined for all x, where exactly is the electron? The purpose of this chapter is to answer these questions.
Using the Wave Function
A clue to the physical meaning of the wave function Ψ(x, t) is provided by the two-slit interference of monochromatic light (Figure 7.2). (See also Electromagnetic Waves (http://cnx.org/content/m58495/latest/) and Interference.) The wave function of a light wave is given by E(x, t), and its energy density is proportional to $|E|^2$, where E is the electric field strength. The energy of an individual photon depends only on the frequency of light, $\varepsilon_{\text{photon}} = hf$, so $|E|^2$ is proportional to the number of photons. When light waves from $S_1$ interfere with light waves from $S_2$ at the viewing screen (a distance D away), an interference pattern is produced (part (a) of the figure). Bright fringes correspond to points of constructive interference of the light waves, and dark fringes correspond to points of destructive interference of the light waves (part (b)).
Suppose the screen is initially unexposed to light. If the screen is exposed to very weak light, the interference pattern appears gradually (Figure 7.2(c), left to right). Individual photon hits on the screen appear as dots. The dot density is expected to be large at locations where the interference pattern will be, ultimately, the most intense. In other words, the probability (per unit area) that a single photon will strike a particular spot on the screen is proportional to the square of the total electric field, $|E|^2$, at that point. Under the right conditions, the same interference pattern develops for matter particles, such as electrons.
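The gradual build-up of the pattern can be mimicked numerically. The following is a minimal sketch, not a model of any particular apparatus: it assumes an idealized two-slit intensity proportional to cos²(πx/w), where the fringe spacing `FRINGE_SPACING` and the screen half-width `HALF_WIDTH` are made-up illustrative values, and it places photons one at a time by rejection sampling so that the density of hits follows $|E|^2$. All function names are hypothetical.

```python
import math
import random

FRINGE_SPACING = 1.0  # assumed fringe spacing w (arbitrary units)
HALF_WIDTH = 2.0      # assumed half-width of the screen (arbitrary units)


def relative_intensity(x):
    """Idealized two-slit intensity: |E|^2 proportional to cos^2(pi x / w)."""
    return math.cos(math.pi * x / FRINGE_SPACING) ** 2


def sample_photon_hits(n_photons, seed=0):
    """Place photons one by one with probability density proportional to |E|^2."""
    rng = random.Random(seed)
    hits = []
    while len(hits) < n_photons:
        x = rng.uniform(-HALF_WIDTH, HALF_WIDTH)
        # Rejection sampling: keep x with probability equal to the relative intensity.
        if rng.random() < relative_intensity(x):
            hits.append(x)
    return hits


def count_hits_near(hits, center, half_window=0.1):
    """Count the hits that land within half_window of a screen position."""
    return sum(1 for x in hits if abs(x - center) < half_window)


if __name__ == "__main__":
    hits = sample_photon_hits(5000)
    print("hits near a bright fringe (x = 0):  ", count_hits_near(hits, 0.0))
    print("hits near a dark fringe (x = 0.5):", count_hits_near(hits, 0.5))
```

With only a few photons the hits look random; with thousands, far more dots accumulate near the bright fringe than near the dark one, which is the behavior described for the weakly exposed screen.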






Figure 7.2 Two-slit interference of monochromatic light. (a) Schematic of two-slit interference; (b) light interference pattern; (c) interference pattern built up gradually under low-intensity light (left to right).
Visit this interactive simulation (https://openstaxcollege.org/l/21intquawavint) to learn more about quantum wave interference.


The square of the matter wave $|\Psi|^2$ in one dimension has a similar interpretation as the square of the electric field $|E|^2$. It gives the probability that a particle will be found at a particular position and time per unit length, also called the probability density. The probability (P) that a particle is found in a narrow interval (x, x + dx) at time t is therefore

$$P(x, x + dx) = |\Psi(x, t)|^2\,dx. \tag{7.1}$$

(Later, we define the magnitude squared for the general case of a function with “imaginary parts.”) This probabilistic interpretation of the wave function is called the Born interpretation. Examples of wave functions and their squares for a particular time t are given in Figure 7.3.






Figure 7.3 Several examples of wave functions and the corresponding square of their wave functions.


If the wave function varies slowly over the interval Δx, the probability that a particle is found in the interval is approximately

$$P(x, x + \Delta x) \approx |\Psi(x, t)|^2\,\Delta x. \tag{7.2}$$

Notice that squaring the wave function ensures that the probability is positive. (This is analogous to squaring the electric field strength, which may be positive or negative, to obtain a positive value of intensity.) However, if the wave function does not vary slowly, we must integrate:

$$P(x, x + \Delta x) = \int_{x}^{x + \Delta x} |\Psi(x, t)|^2\,dx. \tag{7.3}$$


This probability is just the area under the function $|\Psi(x, t)|^2$ between x and x + Δx. The probability of finding the particle “somewhere” (the normalization condition) is

$$P(-\infty, +\infty) = \int_{-\infty}^{\infty} |\Psi(x, t)|^2\,dx = 1. \tag{7.4}$$


For a particle in two dimensions, the integration is over an area and requires a double integral; for a particle in three dimensions, the integration is over a volume and requires a triple integral. For now, we stick to the simple one-dimensional case.
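The difference between the narrow-interval approximation and the full integral can be checked numerically. The sketch below is illustrative only: it assumes a real Gaussian wave function of width `SIGMA` at a fixed time (a choice made here for convenience, not one taken from the text) and evaluates the integral with a simple midpoint rule. The function names are hypothetical.

```python
import math

SIGMA = 1.0  # assumed width of the illustrative Gaussian wave function


def psi(x):
    """Assumed example: a real, normalized Gaussian wave function at fixed time."""
    return (1.0 / (math.pi * SIGMA**2)) ** 0.25 * math.exp(-x**2 / (2.0 * SIGMA**2))


def probability_density(x):
    """|Psi|^2, the probability per unit length."""
    return abs(psi(x)) ** 2


def probability(a, b, steps=10_000):
    """Integral of |Psi|^2 from a to b, evaluated with the midpoint rule."""
    dx = (b - a) / steps
    return sum(probability_density(a + (i + 0.5) * dx) for i in range(steps)) * dx


if __name__ == "__main__":
    # Normalization check; +/-10 sigma stands in for +/-infinity.
    print("total probability:", probability(-10 * SIGMA, 10 * SIGMA))
    # Narrow interval: density times width is a good approximation.
    print("narrow, approximate:", probability_density(0.5) * 0.01)
    print("narrow, integrated: ", probability(0.5, 0.51))
    # Wide interval: the approximation fails and the integral is needed.
    print("wide, approximate:", probability_density(0.5) * 2.0)
    print("wide, integrated: ", probability(0.5, 2.5))
```

For the narrow interval the two numbers agree closely, while for the wide interval the product of density and width badly overestimates the probability, because the density falls off across the interval.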
Example 7.1


Where Is the Ball? (Part I)
A ball is constrained to move along a line inside a tube of length L. The ball is equally likely to be found anywhere in the tube at some time t. What is the probability of finding the ball in the left half of the tube at that time? (The answer is 50%, of course, but how do we get this answer by using the probabilistic interpretation of the quantum mechanical wave function?)
Strategy
The first step is to write down the wave function. The ball is equally likely to be found anywhere in the box, so one way to describe the ball is with a constant wave function (Figure 7.4). The normalization condition can be used to find the value of the function, and a simple integration over half of the box yields the final answer.


Figure 7.4 Wave function for a ball in a tube of length L.


Solution
The wave function of the ball can be written as Ψ(x, t) = C for 0 < x < L, where C is a constant, and Ψ(x, t) = 0 otherwise. We can determine the constant C by applying the normalization condition (we set t = 0 to simplify the notation):

$$P(x = -\infty, +\infty) = \int_{-\infty}^{\infty} |\Psi(x, 0)|^2\,dx = 1.$$

This integral can be broken into three parts: (1) negative infinity to zero, (2) zero to L, and (3) L to infinity. The particle is constrained to be in the tube, so Ψ = 0 outside the tube and the first and last integrations are zero. The above equation can therefore be written

$$P(x = 0, L) = \int_{0}^{L} |C|^2\,dx = 1.$$

The value C does not depend on x and can be taken out of the integral, so we obtain

$$|C|^2 \int_{0}^{L} dx = 1.$$

Integration gives $|C|^2 L = 1$, so, taking C to be real and positive,

$$C = \sqrt{\frac{1}{L}}.$$

To determine the probability of finding the ball in the first half of the box (0 < x < L/2), we have

$$P(x = 0, L/2) = \int_{0}^{L/2} \left|\sqrt{\frac{1}{L}}\right|^2 dx = \left(\frac{1}{L}\right)\frac{L}{2} = 0.50.$$


Significance
The probability of finding the ball in the first half of the tube is 50%, as expected. Two observations are noteworthy. First, this result corresponds to the area under the constant function $|\Psi|^2 = 1/L$ from x = 0 to L/2 (the area of a rectangle left of L/2). Second, this calculation requires an integration of the square of the wave function. A common mistake in performing such calculations is to forget to square the wave function before integration.
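The result of this example can be reproduced numerically. The sketch below is a hedged illustration rather than part of the worked solution: the tube length `L = 2.0` is an arbitrary assumed value, the function names are hypothetical, and the integral is approximated with a midpoint rule. It also evaluates the common mistake of integrating Ψ instead of |Ψ|².

```python
import math


def normalization_constant(length):
    """Constant C satisfying |C|^2 * L = 1 for a uniform wave function on 0 < x < L."""
    return 1.0 / math.sqrt(length)


def psi_uniform(x, length):
    """Uniform wave function: C inside the tube, zero outside."""
    return normalization_constant(length) if 0.0 < x < length else 0.0


def probability(a, b, length, steps=10_000):
    """Midpoint-rule integral of |Psi|^2 from a to b."""
    dx = (b - a) / steps
    return sum(abs(psi_uniform(a + (i + 0.5) * dx, length)) ** 2 for i in range(steps)) * dx


if __name__ == "__main__":
    L = 2.0  # assumed tube length (arbitrary units); the probability does not depend on it
    print("left half, using |Psi|^2:", probability(0.0, L / 2, L))
    # The common mistake: integrating Psi itself rather than its square.
    not_squared = normalization_constant(L) * (L / 2)
    print("left half, forgetting to square (incorrect):", not_squared)
```

The first number is 0.5 for any tube length, while the unsquared integral depends on L and is not a probability at all.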






Example 7.2
Where Is the Ball? (Part II)
A ball is again constrained to move along a line inside a tube of length L. This time, the ball is found preferentially in the middle of the tube. One way to represent its wave function is with a simple cosine function (Figure 7.5). What is the probability of finding the ball in the last one-quarter of the tube?


Figure 7.5 Wave function for a ball in a tube of length L, where the ball is preferentially in the middle of the tube.


Strategy
We use the same strategy as before. In this case, the wave function has two unknown constants: One is associated with the wavelength of the wave and the other is the amplitude of the wave. We determine the wavelength by using the boundary conditions of the problem, and we evaluate the amplitude by using the normalization condition. Integration of the square of the wave function over the last quarter of the tube yields the final answer. The calculation is simplified by centering our coordinate system on the peak of the wave function.
Solution
The wave function of the ball can be written


Ψ(x, 0) = A cos(kx)(−L/2 < x < L/2),


where A is the amplitude of the wave function and k = 2π/λ is its wave number. Beyond this interval, the amplitude of the wave function is zero because the ball is confined to the tube. Requiring the wave function to terminate at the right end of the tube gives


$$\Psi\left(x = \frac{L}{2}, 0\right) = 0.$$


Evaluating the wave function at x = L/2 gives
A cos(kL/2) = 0.


This equation is satisfied if the argument of the cosine is an odd multiple of π/2 (that is, π/2, 3π/2, 5π/2, and so on). The wave function here has a single peak, so we take the smallest of these values:


$$\frac{kL}{2} = \frac{\pi}{2},$$


or
$$k = \frac{\pi}{L}.$$


Applying the normalization condition gives $A = \sqrt{2/L}$, so the wave function of the ball is

$$\Psi(x, 0) = \sqrt{\frac{2}{L}}\cos(\pi x/L), \quad -L/2 < x < L/2.$$






To determine the probability of finding the ball in the last quarter of the tube, we square the function and integrate:
$$P(x = L/4, L/2) = \int_{L/4}^{L/2} \left|\sqrt{\frac{2}{L}}\cos\left(\frac{\pi x}{L}\right)\right|^2 dx = 0.091.$$
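The value 0.091 can be recovered by hand. A sketch of the intermediate steps, using the identity $\cos^2\theta = (1 + \cos 2\theta)/2$ (the identity is standard, and the steps below are filled in here rather than taken from the original solution):

$$P = \frac{2}{L}\int_{L/4}^{L/2}\cos^2\left(\frac{\pi x}{L}\right)dx = \frac{1}{L}\int_{L/4}^{L/2}\left[1 + \cos\left(\frac{2\pi x}{L}\right)\right]dx = \frac{1}{L}\left[x + \frac{L}{2\pi}\sin\left(\frac{2\pi x}{L}\right)\right]_{L/4}^{L/2} = \frac{1}{4} - \frac{1}{2\pi} \approx 0.091.$$

The result does not depend on L, as expected for a probability.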


Significance
The probability of finding the ball in the last quarter of the tube is 9.1%. The ball has a definite wavelength (λ = 2L). If the tube is of macroscopic length (L = 1 m), the momentum of the ball is

$$p = \frac{h}{\lambda} = \frac{h}{2L} \approx 3 \times 10^{-34}\ \text{kg} \cdot \text{m/s}.$$


This momentum is much too small to be measured by any human instrument.


An Interpretation of the Wave Function
We are now in a position to begin to answer the questions posed at the beginning of this section. First, for a traveling particle described by Ψ(x, t) = A sin(kx − ωt), what is “waving?” Based on the above discussion, the answer is a mathematical function that can, among other things, be used to determine where the particle is likely to be when a position measurement is performed. Second, how is the wave function used to make predictions? If it is necessary to find the probability that a particle will be found in a certain interval, square the wave function and integrate over the interval of interest. You will soon learn that the wave function can be used to make many other kinds of predictions as well.
Third, if a matter wave is given by the wave function Ψ(x, t) , where exactly is the particle? Two answers exist: (1) when
the observer is not looking (or the particle is not being otherwise detected), the particle is everywhere (x = −∞, +∞) ;
and (2) when the observer is looking (the particle is being detected), the particle “jumps into” a particular position state
(x, x + dx) with a probability given by P(x, x + dx) = |Ψ(x, t)|² dx, a process called state reduction or wave function
collapse. This answer is called the Copenhagen interpretation of the wave function, or of quantum mechanics.
To illustrate this interpretation, consider the simple case of a particle that can occupy a small container either at x1 or x2
(Figure 7.6). In classical physics, we assume the particle is located either at x1 or x2 when the observer is not looking.
However, in quantum mechanics, the particle may exist in a state of indefinite position—that is, it may be located at x1
and x2 when the observer is not looking. The assumption that a particle can only have one value of position (when the
observer is not looking) is abandoned. Similar comments can be made of other measurable quantities, such as momentum and energy.


Figure 7.6 A two-state system of position of a particle.


The bizarre consequences of the Copenhagen interpretation of quantum mechanics are illustrated by a creative thought experiment first articulated by Erwin Schrödinger (National Geographic, 2013) (Figure 7.7):
“A cat is placed in a steel box along with a Geiger counter, a vial of poison, a hammer, and a radioactive substance. When the radioactive substance decays, the Geiger detects it and triggers the hammer to release the poison, which subsequently kills the cat. The radioactive decay is a random [probabilistic] process, and there is no way to predict when it will happen. Physicists say the atom exists in a state known as a superposition: both decayed and not decayed at the same time. Until the box is opened, an observer doesn’t know whether the cat is alive or dead, because the cat’s fate is intrinsically tied to whether or not the atom has decayed and the cat would [according to the Copenhagen interpretation] be “living and dead ... in equal parts” until it is observed.”




Figure 7.7 Schrödinger’s cat.


Schrödinger took the absurd implications of this thought experiment (a cat simultaneously dead and alive) as an argument against the Copenhagen interpretation. However, this interpretation remains the most commonly taught view of quantum mechanics.
Two-state systems (left and right, atom decays and does not decay, and so on) are often used to illustrate the principles of quantum mechanics. These systems find many applications in nature, including electron spin and mixed states of particles, atoms, and even molecules. Two-state systems are also finding application in the quantum computer, as mentioned in the introduction of this chapter. Unlike a digital computer, which encodes information in binary digits (zeroes and ones), a quantum computer stores and manipulates data in the form of quantum bits, or qubits. In general, a qubit is not in a state of zero or one, but rather in a mixed state of zero and one. If a large number of qubits are placed in the same quantum state, the measurement of an individual qubit would produce a zero with a probability p, and a one with a probability q = 1 − p. Many scientists believe that quantum computers are the future of the computer industry.
Complex Conjugates
Later in this section, you will see how to use the wave function to describe particles that are “free” or bound by forces to other particles. The specific form of the wave function depends on the details of the physical system. A peculiarity of quantum theory is that these functions are usually complex functions. A complex function is one that contains one or more imaginary numbers ($i = \sqrt{-1}$). Experimental measurements produce real (nonimaginary) numbers only, so the above procedure to use the wave function must be slightly modified. In general, the probability that a particle is found in the narrow interval (x, x + dx) at time t is given by


$$P(x, x + dx) = |\Psi(x, t)|^2\,dx = \Psi^*(x, t)\,\Psi(x, t)\,dx, \tag{7.5}$$


where Ψ*(x, t) is the complex conjugate of the wave function. The complex conjugate of a function is obtained by replacing every occurrence of $i = \sqrt{-1}$ in that function with −i. This procedure eliminates complex numbers in all predictions because the product Ψ*(x, t)Ψ(x, t) is always a real number.
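To make the last statement concrete, here is a short sketch using Python's built-in complex type. The number `z = 1 + 2i` is an arbitrary value chosen for illustration and does not come from the text.

```python
import math

z = complex(1.0, 2.0)          # illustrative complex number z = 1 + 2i
z_star = z.conjugate()         # complex conjugate: every i replaced by -i
product = z_star * z           # (1 - 2i)(1 + 2i) = 1 + 4 = 5

print(z_star, product)
# The imaginary part cancels, and the real part equals |z|^2.
is_real = product.imag == 0.0
matches_modulus = math.isclose(product.real, abs(z) ** 2)
```

The same cancellation happens pointwise for a wave function, which is why Ψ*Ψ can serve as a probability density.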


Check Your Understanding 7.1 If a = 3 + 4i, what is the product a*a?




Consider the motion of a free particle that moves along the x-direction. As the name suggests, a free particle experiences no forces and so moves with a constant velocity. As we will see in a later section of this chapter, a formal quantum mechanical treatment of a free particle indicates that its wave function has real and imaginary parts. In particular, the wave function is given by
Ψ(x, t) = A cos(kx − ωt) + iA sin(kx − ωt),


where A is the amplitude, k is the wave number, and ω is the angular frequency. Using Euler’s formula, $e^{i\phi} = \cos(\phi) + i\sin(\phi)$, this equation can be written in the form

$$\Psi(x, t) = Ae^{i(kx - \omega t)} = Ae^{i\phi},$$


where ϕ is the phase angle. If the wave function varies slowly over the interval Δx, the probability of finding the particle
in that interval is


$$P(x, x + \Delta x) \approx \Psi^*(x, t)\,\Psi(x, t)\,\Delta x = \left(Ae^{i\phi}\right)\left(A^* e^{-i\phi}\right)\Delta x = (A^* A)\,\Delta x.$$


If A has real and imaginary parts (A = a + ib, where a and b are real constants), then

$$A^* A = (a + ib)(a - ib) = a^2 + b^2.$$


Notice that the complex numbers have vanished. Thus,
$$P(x, x + \Delta x) \approx |A|^2\,\Delta x$$


is a real quantity. The interpretation of Ψ* (x, t)Ψ(x, t) as a probability density ensures that the predictions of quantum
mechanics can be checked in the “real world.”
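The claim that the free-particle probability density is real and equal to |A|² everywhere can be spot-checked numerically. This is a minimal sketch: the amplitude `A = 0.6 + 0.8i`, the wave number `k`, the angular frequency `omega`, and the sample points are all assumed values chosen for illustration.

```python
import cmath

A = complex(0.6, 0.8)    # assumed complex amplitude a + ib, so |A|^2 = 1
k, omega = 2.0, 3.0      # assumed wave number and angular frequency

def psi(x, t):
    """Free-particle wave function A * exp(i(kx - omega t))."""
    return A * cmath.exp(1j * (k * x - omega * t))

def density(x, t):
    """Psi* Psi, returned as a complex number so both parts can be inspected."""
    value = psi(x, t)
    return value.conjugate() * value

# Evaluate at a few arbitrary positions and times.
samples = [density(x, t) for x in (-1.0, 0.0, 2.5) for t in (0.0, 0.7)]
for s in samples:
    print(f"real = {s.real:.12f}, imag = {s.imag:.1e}")
```

Every sample has the same real part, a² + b², and a vanishing imaginary part, independent of x and t.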


Check Your Understanding 7.2 Suppose that a particle with energy E is moving along the x-axis and is confined in the region between 0 and L. One possible wave function is

$$\psi(x, t) = \begin{cases} Ae^{-iEt/\hbar}\sin\dfrac{\pi x}{L}, & \text{when } 0 \leq x \leq L \\ 0, & \text{otherwise.} \end{cases}$$

Determine the normalization constant.


Expectation Values
In classical mechanics, the solution to an equation of motion is a function of a measurable quantity, such as x(t), where x is the position and t is the time. Note that the particle has one value of position for any time t. In quantum mechanics, however, the solution to an equation of motion is a wave function, Ψ(x, t). The particle has many values of position for any time t, and only the probability density of finding the particle, |Ψ(x, t)|², can be known. The average value of position for a large number of particles with the same wave function is expected to be


$$\langle x \rangle = \int_{-\infty}^{\infty} x\,P(x, t)\,dx = \int_{-\infty}^{\infty} x\,\Psi^*(x, t)\,\Psi(x, t)\,dx. \tag{7.6}$$


This is called the expectation value of the position. It is usually written


$$\langle x \rangle = \int_{-\infty}^{\infty} \Psi^*(x, t)\,x\,\Psi(x, t)\,dx, \tag{7.7}$$


where the x is sandwiched between the wave functions. The reason for this will become apparent soon. Formally, x is called the position operator.
At this point, it is important to stress that a wave function can be written in terms of other quantities as well, such as velocity (v), momentum (p), and kinetic energy (K). The expectation value of momentum, for example, can be written
$$\langle p \rangle = \int_{-\infty}^{\infty} \Psi^*(p, t)\,p\,\Psi(p, t)\,dp, \tag{7.8}$$


where dp is used instead of dx to indicate an infinitesimal interval in momentum. In some cases, we know the wave function in position, Ψ(x, t), but seek the expectation value of momentum. The procedure for doing this is
$$\langle p \rangle = \int_{-\infty}^{\infty} \Psi^*(x, t)\left(-i\hbar\frac{d}{dx}\right)\Psi(x, t)\,dx, \tag{7.9}$$


where the quantity in parentheses, sandwiched between the wave functions, is called the momentum operator in the x-direction. [The momentum operator in Equation 7.9 is said to be the position-space representation of the momentum operator.] The momentum operator must act (operate) on the wave function to the right, and then the result must be multiplied by the complex conjugate of the wave function on the left, before integration. The momentum operator in the x-direction is sometimes denoted

$$(p_x)_{\text{op}} = -i\hbar\frac{d}{dx}. \tag{7.10}$$


Momentum operators for the y- and z-directions are defined similarly. This operator and many others are derived in a more advanced course in modern physics. In some cases, this derivation is relatively simple. For example, the kinetic energy operator is just

$$(K)_{\text{op}} = \frac{1}{2}m\left(v_x\right)_{\text{op}}^2 = \frac{\left(p_x\right)_{\text{op}}^2}{2m} = \frac{\left(-i\hbar\dfrac{d}{dx}\right)^2}{2m} = \frac{-\hbar^2}{2m}\left(\frac{d}{dx}\right)\left(\frac{d}{dx}\right). \tag{7.11}$$


Thus, if we seek an expectation value of kinetic energy of a particle in one dimension, two successive ordinary derivatives of the wave function are required before integration.
Expectation-value calculations are often simplified by exploiting the symmetry of wave functions. Symmetric wave functions can be even or odd. An even function is a function that satisfies


$$\psi(x) = \psi(-x). \tag{7.12}$$

In contrast, an odd function is a function that satisfies

$$\psi(x) = -\psi(-x). \tag{7.13}$$
An example of even and odd functions is shown in Figure 7.8. An even function is symmetric about the y-axis. This function is produced by reflecting ψ(x) for x > 0 about the vertical y-axis. By comparison, an odd function is generated by reflecting the function about the y-axis and then about the x-axis. (An odd function is also referred to as an anti-symmetric function.)


Figure 7.8 Examples of even and odd wave functions.


In general, an even function times an even function produces an even function. A simple example of an even function is the product $x^2 e^{-x^2}$ (even times even is even). Similarly, an odd function times an odd function produces an even function, such as x sin x (odd times odd is even). However, an odd function times an even function produces an odd function, such as $xe^{-x^2}$ (odd times even is odd). The integral over all space of an odd function is zero, because the total area of the function above the x-axis cancels the (negative) area below it. As the next example shows, this property of odd functions is very useful.
Example 7.3


Expectation Value (Part I)
The normalized wave function of a particle is


$$\psi(x) = \frac{e^{-|x|/x_0}}{\sqrt{x_0}}.$$


Find the expectation value of position.
Strategy
Substitute the wave function into Equation 7.7 and evaluate. The position operator introduces a multiplicative factor only, so the position operator need not be “sandwiched.”
Solution
First multiply, then integrate:


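$$\langle x \rangle = \int_{-\infty}^{+\infty} dx\,x\,|\psi(x)|^2 = \int_{-\infty}^{+\infty} dx\,x\left|\frac{e^{-|x|/x_0}}{\sqrt{x_0}}\right|^2 = \frac{1}{x_0}\int_{-\infty}^{+\infty} dx\,x\,e^{-2|x|/x_0} = 0.$$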


Significance
The function in the integrand ($xe^{-2|x|/x_0}$) is odd since it is the product of an odd function (x) and an even function ($e^{-2|x|/x_0}$). The integral vanishes because the total area of the function above the x-axis cancels the (negative) area below it. The result (⟨x⟩ = 0) is not surprising since the probability density function is symmetric about x = 0.
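A numerical check of this example is sketched below. The length scale `x0 = 1.5`, the integration cutoff of 30 length scales (standing in for infinity), and the midpoint-rule helper `integrate` are assumptions made for illustration only.

```python
import math

def integrate(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

x0 = 1.5             # assumed length scale, arbitrary units
cutoff = 30.0 * x0   # the density is negligible beyond this distance

def psi(x):
    """Normalized wave function exp(-|x|/x0) / sqrt(x0)."""
    return math.exp(-abs(x) / x0) / math.sqrt(x0)

norm = integrate(lambda x: psi(x) ** 2, -cutoff, cutoff)        # close to 1
mean_x = integrate(lambda x: x * psi(x) ** 2, -cutoff, cutoff)  # close to 0
print(f"norm = {norm:.6f}, <x> = {mean_x:.2e}")
```

Because the sample points are placed symmetrically about x = 0, the odd integrand cancels pairwise, mirroring the symmetry argument above.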


Example 7.4
Expectation Value (Part II)
The time-dependent wave function of a particle confined to a region between 0 and L is


ψ(x, t) = Ae−iωt sin(πx/L)


where ω is the angular frequency, related to the energy E of the particle by E = ℏω. (Note: The function varies as a sine because of the limits (0 to L). When x = 0 or x = L, the sine factor is zero and the wave function is zero, consistent with the boundary conditions.) Calculate the expectation values of position, momentum, and kinetic energy.
Strategy
We must first normalize the wave function to find A. Then we use the operators to calculate the expectation values.
Solution
Computation of the normalization constant:


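$$1 = \int_0^L dx\,\psi^*(x)\,\psi(x) = \int_0^L dx\left(Ae^{+i\omega t}\sin\frac{\pi x}{L}\right)\left(Ae^{-i\omega t}\sin\frac{\pi x}{L}\right) = A^2\int_0^L dx\,\sin^2\frac{\pi x}{L} = A^2\frac{L}{2} \;\Rightarrow\; A = \sqrt{\frac{2}{L}}.$$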


The expectation value of position is


$$\langle x \rangle = \int_0^L dx\,\psi^*(x)\,x\,\psi(x) = \int_0^L dx\left(Ae^{+i\omega t}\sin\frac{\pi x}{L}\right)x\left(Ae^{-i\omega t}\sin\frac{\pi x}{L}\right) = A^2\int_0^L dx\,x\sin^2\frac{\pi x}{L} = A^2\frac{L^2}{4} = \frac{L}{2}.$$


The expectation value of momentum in the x-direction also requires an integral. To set this integral up, the associated operator must, by rule, act to the right on the wave function ψ(x):

$$-i\hbar\frac{d}{dx}\psi(x) = -i\hbar\frac{d}{dx}Ae^{-i\omega t}\sin\frac{\pi x}{L} = -i\frac{Ah}{2L}e^{-i\omega t}\cos\frac{\pi x}{L}.$$


Therefore, the expectation value of momentum is
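$$\langle p \rangle = \int_0^L dx\left(Ae^{+i\omega t}\sin\frac{\pi x}{L}\right)\left(-i\frac{Ah}{2L}e^{-i\omega t}\cos\frac{\pi x}{L}\right) = -i\frac{A^2h}{4L}\int_0^L dx\,\sin\frac{2\pi x}{L} = 0.$$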


The function in the integral is a sine function with a wavelength equal to the width of the well, L, which makes it an odd function about x = L/2. As a result, the integral vanishes.
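The expectation value of kinetic energy in the x-direction requires the associated operator to act on the wave function:

$$-\frac{\hbar^2}{2m}\frac{d^2}{dx^2}\psi(x) = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2}Ae^{-i\omega t}\sin\frac{\pi x}{L} = -\frac{\hbar^2}{2m}Ae^{-i\omega t}\frac{d^2}{dx^2}\sin\frac{\pi x}{L} = \frac{A\pi^2\hbar^2}{2mL^2}e^{-i\omega t}\sin\frac{\pi x}{L} = \frac{Ah^2}{8mL^2}e^{-i\omega t}\sin\frac{\pi x}{L}.$$

Thus, the expectation value of the kinetic energy is

$$\langle K \rangle = \int_0^L dx\left(Ae^{+i\omega t}\sin\frac{\pi x}{L}\right)\left(\frac{Ah^2}{8mL^2}e^{-i\omega t}\sin\frac{\pi x}{L}\right) = \frac{A^2h^2}{8mL^2}\int_0^L dx\,\sin^2\frac{\pi x}{L} = \frac{A^2h^2}{8mL^2}\frac{L}{2} = \frac{h^2}{8mL^2}.$$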


Significance
The average position of a large number of particles in this state is L/2. The average momentum of these particles is zero because a given particle is equally likely to be moving right or left. However, the particle is not at rest because its average kinetic energy is not zero. Finally, the probability density is


$$|\psi|^2 = (2/L)\sin^2(\pi x/L).$$


This probability density is largest at location L/2 and is zero at x = 0 and at x = L. Note that these conclusions
do not depend explicitly on time.
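The three expectation values can be cross-checked numerically. The sketch below works in assumed units with ℏ = m = L = 1 and uses a hypothetical midpoint-rule helper `integrate`; the time-dependent phase factors cancel in every integrand, so only the spatial part A sin(πx/L) appears. In these units the expected kinetic energy is π²ℏ²/(2mL²), which is the same quantity as h²/(8mL²) because h = 2πℏ.

```python
import math

def integrate(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

hbar, m, L = 1.0, 1.0, 1.0    # assumed units for illustration
A = math.sqrt(2.0 / L)        # normalization constant from the solution
q = math.pi / L               # wave number of the half-wave

def psi(x):    # spatial part of the wave function
    return A * math.sin(q * x)

def dpsi(x):   # first derivative of psi
    return A * q * math.cos(q * x)

def d2psi(x):  # second derivative of psi
    return -A * q ** 2 * math.sin(q * x)

mean_x = integrate(lambda x: psi(x) * x * psi(x), 0.0, L)
# <p> = -i*hbar * integral(psi * dpsi); the integral itself vanishes.
mean_p = -1j * hbar * integrate(lambda x: psi(x) * dpsi(x), 0.0, L)
mean_K = integrate(lambda x: psi(x) * (-hbar ** 2 / (2 * m)) * d2psi(x), 0.0, L)

expected_K = (2 * math.pi * hbar) ** 2 / (8 * m * L ** 2)  # h^2 / (8 m L^2)
print(mean_x, abs(mean_p), mean_K, expected_K)
```

The operator is applied to the right-hand wave function first (through `dpsi` and `d2psi`), and only then is the result multiplied by the left-hand factor and integrated, following the rule stated for Equation 7.9.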


Check Your Understanding 7.3 For the particle in the above example, find the probability of locating it between positions 0 and L/4.


Quantum mechanics makes many surprising predictions. However, in 1920, Niels Bohr (founder of the Niels Bohr Institute in Copenhagen, from which we get the term “Copenhagen interpretation”) asserted that the predictions of quantum mechanics and classical mechanics must agree for all macroscopic systems, such as orbiting planets, bouncing balls, rocking chairs, and springs. This correspondence principle is now generally accepted. It suggests the rules of classical mechanics are an approximation of the rules of quantum mechanics for systems with very large energies. Quantum mechanics describes both the microscopic and macroscopic world, but classical mechanics describes only the latter.






7.2 | The Heisenberg Uncertainty Principle
Learning Objectives


By the end of this section, you will be able to:
• Describe the physical meaning of the position-momentum uncertainty relation
• Explain the origins of the uncertainty principle in quantum theory
• Describe the physical meaning of the energy-time uncertainty relation


Heisenberg’s uncertainty principle is a key principle in quantum mechanics. Very roughly, it states that if we know everything about where a particle is located (the uncertainty of position is small), we know nothing about its momentum (the uncertainty of momentum is large), and vice versa. Versions of the uncertainty principle also exist for other quantities, such as energy and time. We discuss the momentum-position and energy-time uncertainty principles separately.
Momentum and Position
To illustrate the momentum-position uncertainty principle, consider a free particle that moves along the x-direction. The particle moves with a constant velocity u and momentum p = mu. According to de Broglie’s relations, p = ℏk and E = ℏω. As discussed in the previous section, the wave function for this particle is given by


$$\psi_k(x, t) = A[\cos(\omega t - kx) - i\sin(\omega t - kx)] = Ae^{-i(\omega t - kx)} = Ae^{-i\omega t}e^{ikx} \tag{7.14}$$

and the probability density $|\psi_k(x, t)|^2 = A^2$ is uniform and independent of time. The particle is equally likely to be found anywhere along the x-axis but has definite values of wavelength and wave number, and therefore momentum. The uncertainty of position is infinite (we are completely uncertain about position) and the uncertainty of the momentum is zero (we are completely certain about momentum). This account of a free particle is consistent with Heisenberg’s uncertainty principle.
Similar statements can be made of localized particles. In quantum theory, a localized particle is modeled by a linear superposition of free-particle (or plane-wave) states called a wave packet. An example of a wave packet is shown in Figure 7.9. A wave packet contains many wavelengths and therefore, by de Broglie’s relations, many momenta (possible in quantum mechanics!). This particle also has many values of position, although the particle is confined mostly to the interval Δx. The particle can be better localized (Δx can be decreased) if more plane-wave states of different wavelengths or momenta are added together in the right way (Δp is increased). According to Heisenberg, these uncertainties obey the following relation.


The Heisenberg Uncertainty Principle
The product of the uncertainty in position of a particle and the uncertainty in its momentum can never be less than one-half of the reduced Planck constant:

$$\Delta x\,\Delta p \geq \hbar/2. \tag{7.15}$$

This relation expresses Heisenberg’s uncertainty principle. It places limits on what we can know about a particle from simultaneous measurements of position and momentum. If Δx is large, Δp is small, and vice versa. Equation 7.15 can be derived in a more advanced course in modern physics. Reflecting on this relation in his work The Physical Principles of the Quantum Theory, Heisenberg wrote “Any use of the words ‘position’ and ‘velocity’ with accuracy exceeding that given by [the relation] is just as meaningless as the use of words whose sense is not defined.”






Figure 7.9 Adding together several plane waves of different wavelengths can produce a wave that is relatively localized.


Note that the uncertainty principle has nothing to do with the precision of an experimental apparatus. Even for perfect measuring devices, these uncertainties would remain because they originate in the wave-like nature of matter. The precise value of the product ΔxΔp depends on the specific form of the wave function. Interestingly, the Gaussian function (or bell-curve distribution) gives the minimum value of the uncertainty product: Δx Δp = ℏ/2.
Example 7.5


The Uncertainty Principle Large and Small
Determine the minimum uncertainties in the positions of the following objects if their speeds are known with a precision of $1.0 \times 10^{-3}$ m/s: (a) an electron and (b) a bowling ball of mass 6.0 kg.
Strategy
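Given the uncertainty in speed $\Delta u = 1.0 \times 10^{-3}$ m/s, we have to first determine the uncertainty in momentum Δp = m Δu and then invert Equation 7.15 to find the uncertainty in position Δx = ℏ/(2Δp).
Solution
a. For the electron:

$$\Delta p = m\,\Delta u = (9.1 \times 10^{-31}\ \text{kg})(1.0 \times 10^{-3}\ \text{m/s}) = 9.1 \times 10^{-34}\ \text{kg} \cdot \text{m/s},$$

$$\Delta x = \frac{\hbar}{2\,\Delta p} = 5.8\ \text{cm}.$$

b. For the bowling ball:

$$\Delta p = m\,\Delta u = (6.0\ \text{kg})(1.0 \times 10^{-3}\ \text{m/s}) = 6.0 \times 10^{-3}\ \text{kg} \cdot \text{m/s},$$

$$\Delta x = \frac{\hbar}{2\,\Delta p} = 8.8 \times 10^{-33}\ \text{m}.$$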


Significance
Unlike the position uncertainty for the electron, the position uncertainty for the bowling ball is immeasurably small. Planck’s constant is very small, so the limitations imposed by the uncertainty principle are not noticeable in macroscopic systems such as a bowling ball.
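The arithmetic of this example is easy to reproduce. In the sketch below, the function name `min_position_uncertainty` is a hypothetical helper introduced for illustration, and the value of ℏ is the standard SI value rather than a number quoted in the example.

```python
HBAR = 1.054571817e-34  # reduced Planck constant in J*s (standard SI value)

def min_position_uncertainty(mass_kg, speed_uncertainty_m_s):
    """Minimum position uncertainty hbar / (2 * dp), with dp = m * du."""
    dp = mass_kg * speed_uncertainty_m_s
    return HBAR / (2.0 * dp)

du = 1.0e-3  # speed uncertainty in m/s, from the problem statement
dx_electron = min_position_uncertainty(9.1e-31, du)  # electron mass in kg
dx_ball = min_position_uncertainty(6.0, du)          # bowling ball mass in kg

print(f"electron:     {dx_electron * 100:.1f} cm")
print(f"bowling ball: {dx_ball:.1e} m")
```

The two results differ by about 31 orders of magnitude, which is simply the ratio of the two masses.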






Example 7.6
Uncertainty and the Hydrogen Atom
Estimate the ground-state energy of a hydrogen atom using Heisenberg’s uncertainty principle. (Hint: According to early experiments, the size of a hydrogen atom is approximately 0.1 nm.)
Strategy
An electron bound to a hydrogen atom can be modeled by a particle bound to a one-dimensional box of length L = 0.1 nm. The ground-state wave function of this system is a half wave, like that given in Example 7.2. This is the largest wavelength that can “fit” in the box, so the wave function corresponds to the lowest energy state. Note that this function is very similar in shape to a Gaussian (bell curve) function. We can take the average energy of a particle described by this function (E) as a good estimate of the ground state energy ($E_0$). This average energy of a particle is related to its average of the momentum squared, which is related to its momentum uncertainty.
Solution
To solve this problem, we must be specific about what is meant by “uncertainty of position” and “uncertainty of momentum.” We identify the uncertainty of position (Δx) with the standard deviation of position ($\sigma_x$), and the uncertainty of momentum (Δp) with the standard deviation of momentum ($\sigma_p$). For the Gaussian function, the uncertainty product is

$$\sigma_x\,\sigma_p = \frac{\hbar}{2},$$

where

$$\sigma_x^2 = \overline{x^2} - \bar{x}^2 \quad \text{and} \quad \sigma_p^2 = \overline{p^2} - \bar{p}^2.$$


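The particle is equally likely to be moving left as moving right, so $\bar{p} = 0$. Also, the uncertainty of position is comparable to the size of the box, so $\sigma_x = L$. The estimated ground state energy is therefore

$$E_0 = E_{\text{Gaussian}} = \frac{\overline{p^2}}{2m} = \frac{\sigma_p^2}{2m} = \frac{1}{2m}\left(\frac{\hbar}{2\sigma_x}\right)^2 = \frac{1}{2m}\left(\frac{\hbar}{2L}\right)^2 = \frac{\hbar^2}{8mL^2}.$$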


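Multiplying numerator and denominator by $c^2$ gives

$$E_0 = \frac{(\hbar c)^2}{8(mc^2)L^2} = \frac{(197.3\ \text{eV} \cdot \text{nm})^2}{8\left(0.511 \times 10^6\ \text{eV}\right)(0.1\ \text{nm})^2} = 0.952\ \text{eV} \approx 1\ \text{eV}.$$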


Significance
Based on early estimates of the size of a hydrogen atom and the uncertainty principle, the ground-state energy of a hydrogen atom is in the eV range. The ionization energy of an electron in the ground state is approximately 10 eV, so this prediction is roughly confirmed. (Note: The product ℏc is often a useful value in performing calculations in quantum mechanics.)
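The estimate can be reproduced in a few lines, working in eV and nm so that the product ℏc can be used directly. The function name `ground_state_estimate` is a hypothetical helper; the constants are the values quoted in the solution.

```python
HBAR_C = 197.3                    # hbar * c in eV*nm, as used in the solution
ELECTRON_REST_ENERGY = 0.511e6    # m c^2 for the electron in eV

def ground_state_estimate(box_length_nm):
    """Uncertainty-principle estimate E0 = (hbar c)^2 / (8 (m c^2) L^2), in eV."""
    return HBAR_C ** 2 / (8.0 * ELECTRON_REST_ENERGY * box_length_nm ** 2)

E0 = ground_state_estimate(0.1)   # L = 0.1 nm, the assumed size of the atom
print(f"E0 = {E0:.3f} eV")
```

Since the estimate scales as 1/L², halving the assumed size of the atom would quadruple the energy, so only the order of magnitude should be trusted.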


Energy and Time
Another kind of uncertainty principle concerns uncertainties in simultaneous measurements of the energy of a quantum state and its lifetime,

$$\Delta E\,\Delta t \geq \frac{\hbar}{2}, \tag{7.16}$$

where ΔE is the uncertainty in the energy measurement and Δt is the uncertainty in the lifetime measurement. The energy-time uncertainty principle does not result from a relation of the type expressed by Equation 7.15, for technical reasons beyond this discussion. Nevertheless, the general meaning of the energy-time principle is that a quantum state that exists for only a short time cannot have a definite energy. The reason is that the frequency of a state is inversely proportional to time and the frequency connects with the energy of the state, so to measure the energy with good precision, the state must be observed for many cycles.
To illustrate, consider the excited states of an atom. The finite lifetimes of these states can be deduced from the shapes of spectral lines observed in atomic emission spectra. Each time an excited state decays, the emitted energy is slightly different and, therefore, the emission line is characterized by a distribution of spectral frequencies (or wavelengths) of the emitted photons. As a result, all spectral lines are characterized by spectral widths. The average energy of the emitted photon corresponds to the theoretical energy of the excited state and gives the spectral location of the peak of the emission line. Short-lived states have broad spectral widths and long-lived states have narrow spectral widths.
Example 7.7


Atomic Transitions
An atom typically exists in an excited state for about $\Delta t = 10^{-8}$ s. Estimate the uncertainty Δf in the frequency of emitted photons when an atom makes a transition from an excited state with the simultaneous emission of a photon with an average frequency of $f = 7.1 \times 10^{14}$ Hz. Is the emitted radiation monochromatic?
Strategy
We invert Equation 7.16 to obtain the energy uncertainty ΔE ≈ ℏ/(2Δt) and combine it with the photon energy E = hf to obtain Δf. To estimate whether or not the emission is monochromatic, we evaluate Δf/f.
Solution
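The spread in photon energies is ΔE = hΔf. Therefore,

$$\Delta E \approx \frac{\hbar}{2\Delta t} \;\Rightarrow\; h\,\Delta f \approx \frac{\hbar}{2\Delta t} \;\Rightarrow\; \Delta f \approx \frac{1}{4\pi\,\Delta t} = \frac{1}{4\pi(10^{-8}\ \text{s})} = 8.0 \times 10^{6}\ \text{Hz},$$

$$\frac{\Delta f}{f} = \frac{8.0 \times 10^{6}\ \text{Hz}}{7.1 \times 10^{14}\ \text{Hz}} = 1.1 \times 10^{-8}.$$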


Significance
Because the emitted photons have their frequencies within $1.1 \times 10^{-6}$ percent of the average frequency, the emitted radiation can be considered monochromatic.
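The numbers in this example follow from one formula, Δf ≈ 1/(4πΔt). A minimal sketch, where the function name `frequency_spread` is a hypothetical helper introduced for illustration:

```python
import math

def frequency_spread(lifetime_s):
    """Frequency uncertainty 1 / (4 pi dt) implied by dE dt ~ hbar / 2 and E = h f."""
    return 1.0 / (4.0 * math.pi * lifetime_s)

lifetime = 1.0e-8   # excited-state lifetime in s, from the problem statement
f_mean = 7.1e14     # average photon frequency in Hz, from the problem statement

df = frequency_spread(lifetime)
relative_spread = df / f_mean
print(f"df = {df:.2e} Hz, df/f = {relative_spread:.2e}")
```

Planck's constant cancels out of the final expression, so the spread in frequency depends only on the lifetime of the state.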


Check Your Understanding 7.4 A sodium atom makes a transition from the first excited state to the ground state, emitting a 589.0-nm photon with energy 2.105 eV. If the lifetime of this excited state is $1.6 \times 10^{-8}$ s, what is the uncertainty in energy of this excited state? What is the width of the corresponding spectral line?


7.3 | The Schrödinger Equation
Learning Objectives


By the end of this section, you will be able to:
• Describe the role Schrödinger’s equation plays in quantum mechanics
• Explain the difference between the time-dependent and time-independent Schrödinger equations
• Interpret the solutions of Schrödinger’s equation


In the preceding two sections, we described how to use a quantum mechanical wave function and discussed Heisenberg’s uncertainty principle. In this section, we present a complete and formal theory of quantum mechanics that can be used to make predictions. In developing this theory, it is helpful to review the wave theory of light. For a light wave, the electric field E(x,t) obeys the relation
$$\frac{\partial^2 E}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 E}{\partial t^2}, \tag{7.17}$$


where c is the speed of light and the symbol ∂ represents a partial derivative. (Recall from Oscillations (http://cnx.org/content/m58360/latest/) that a partial derivative is closely related to an ordinary derivative, but involves functions of more than one variable. When taking the partial derivative of a function with respect to a certain variable, all other variables are held constant.) A light wave consists of a very large number of photons, so the quantity $|E(x, t)|^2$ can be interpreted as a probability density of finding a single photon at a particular point in space (for example, on a viewing screen).
There are many solutions to this equation. One solution of particular importance is


(7.18)E(x, t) = A sin(kx − ωt),
where A is the amplitude of the electric field, k is the wave number, and ω is the angular frequency. Combining this equation with Equation 7.17 gives

$$k^2 = \frac{\omega^2}{c^2}. \tag{7.19}$$


According to de Broglie’s equations, we have p = ℏk and E = ℏω. Substituting these equations in Equation 7.19 gives

$$p = \frac{E}{c}, \tag{7.20}$$

or

$$E = pc. \tag{7.21}$$


Therefore, according to Einstein’s general energy-momentum equation (Equation 5.11), Equation 7.17 describes a particle with a zero rest mass. This is consistent with our knowledge of a photon.
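The step from the wave equation to Equation 7.19 can be spelled out. A sketch of the substitution, writing the field as $E(x, t) = A\sin(kx - \omega t)$ (Equation 7.18) and differentiating twice with respect to each variable:

$$\frac{\partial^2 E}{\partial x^2} = -k^2 A\sin(kx - \omega t), \qquad \frac{\partial^2 E}{\partial t^2} = -\omega^2 A\sin(kx - \omega t).$$

Inserting both results into Equation 7.17 and cancelling the common factor $-A\sin(kx - \omega t)$ leaves $k^2 = \omega^2/c^2$. With $k = p/\hbar$ and $\omega = E/\hbar$ (where E now denotes the photon energy, not the field), this becomes $p^2/\hbar^2 = E^2/(\hbar^2 c^2)$, and taking the positive root gives $p = E/c$.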
This process can be reversed. We can begin with the energy-momentum equation of a particle and then ask what wave equation corresponds to it. The energy-momentum equation of a nonrelativistic particle in one dimension is

$$E = \frac{p^2}{2m} + U(x, t), \tag{7.22}$$

where p is the momentum, m is the mass, and U is the potential energy of the particle. The wave equation that goes with it turns out to be a key equation in quantum mechanics, called Schrödinger’s time-dependent equation.
The Schrödinger Time-Dependent Equation
The equation describing the energy and momentum of a wave function is known as the Schrödinger equation:

$$-\frac{\hbar^2}{2m}\frac{\partial^2 \Psi(x, t)}{\partial x^2} + U(x, t)\,\Psi(x, t) = i\hbar\frac{\partial \Psi(x, t)}{\partial t}. \tag{7.23}$$


As described in Potential Energy and Conservation of Energy (http://cnx.org/content/m58311/latest/), the force on the particle described by this equation is given by

$$F = -\frac{\partial U(x, t)}{\partial x}. \tag{7.24}$$


This equation plays a role in quantum mechanics similar to Newton’s second law in classical mechanics. Once the potential energy of a particle is specified (or, equivalently, once the force on the particle is specified), we can solve this differential equation for the wave function. The solution to Newton’s second law equation (also a differential equation) in one dimension is a function x(t) that specifies where an object is at any time t. The solution to Schrödinger’s time-dependent equation provides a tool, the wave function, that can be used to determine where the particle is likely to be. This equation can also be written in two or three dimensions. Solving Schrödinger’s time-dependent equation often requires the aid of a computer.
Consider the special case of a free particle. A free particle experiences no force (F = 0). Based on Equation 7.24, this requires only that

$$U(x, t) = U_0 = \text{constant}. \tag{7.25}$$


For simplicity, we set $U_0 = 0$. Schrödinger’s equation then reduces to
(7.26) $-\dfrac{\hbar^2}{2m}\dfrac{\partial^2 \Psi(x, t)}{\partial x^2} = i\hbar\dfrac{\partial \Psi(x, t)}{\partial t}.$


A valid solution to this equation is
(7.27) $\Psi(x, t) = A e^{i(kx - \omega t)}.$


Not surprisingly, this solution contains an imaginary number ($i = \sqrt{-1}$) because the differential equation itself contains an imaginary number. As stressed before, however, quantum-mechanical predictions depend only on $|\Psi(x, t)|^2$, which yields completely real values. Notice that the real plane-wave solutions, $\Psi(x, t) = A \sin(kx - \omega t)$ and $\Psi(x, t) = A \cos(kx - \omega t)$, do not obey Schrödinger’s equation. The temptation to think that a wave function can be seen, touched, and felt in nature is eliminated by the appearance of an imaginary number. In Schrödinger’s theory of quantum mechanics, the wave function is merely a tool for calculating things.
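The claim that the complex plane wave solves the free-particle equation while a real cosine wave does not can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes natural units (ℏ = m = 1), an arbitrary wave number k = 2, and the free-particle relation ω = ℏk²/2m, and it approximates the derivatives in Equation 7.26 with central differences. The function names are illustrative.

```python
import cmath
import math

HBAR = 1.0                       # natural units (assumption for illustration)
M = 1.0
K = 2.0                          # arbitrary wave number
OMEGA = HBAR * K**2 / (2 * M)    # free-particle relation between omega and k

def complex_wave(x, t):
    """Plane wave of Equation 7.27 with A = 1."""
    return cmath.exp(1j * (K * x - OMEGA * t))

def cosine_wave(x, t):
    """Real plane wave with A = 1."""
    return math.cos(K * x - OMEGA * t)

def residual(psi, x, t, h=1e-4):
    """Left side minus right side of Equation 7.26, using central differences."""
    d2_dx2 = (psi(x + h, t) - 2 * psi(x, t) + psi(x - h, t)) / h**2
    d_dt = (psi(x, t + h) - psi(x, t - h)) / (2 * h)
    return -(HBAR**2) / (2 * M) * d2_dx2 - 1j * HBAR * d_dt

res_complex = abs(residual(complex_wave, 0.3, 0.7))
res_cosine = abs(residual(cosine_wave, 0.3, 0.7))
print("complex plane wave residual:", res_complex)
print("cosine wave residual:", res_cosine)
```

The residual for the complex wave is limited only by the finite-difference error, whereas the residual for the cosine wave stays of order one: its spatial term is real and its time term is imaginary, so the two cannot cancel.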
If the potential energy function (U) does not depend on time, it is possible to show that


(7.28) $\Psi(x, t) = \psi(x) e^{-i\omega t}$


satisfies Schrödinger’s time-dependent equation, where $\psi(x)$ is a time-independent function and $e^{-i\omega t}$ is a space-independent function. In other words, the wave function is separable into two parts: a space-only part and a time-only part. The factor $e^{-i\omega t}$ is sometimes referred to as a time-modulation factor since it modifies the space-only function. According to de Broglie, the energy of a matter wave is given by $E = \hbar\omega$, where E is its total energy. Thus, the above equation can also be written as
(7.29) $\Psi(x, t) = \psi(x) e^{-iEt/\hbar}.$
Any linear combination of such states (a mixed state of energy or momentum) is also a valid solution to this equation. Such states can, for example, describe a localized particle (see Figure 7.9).


7.5 Check Your Understanding A particle with mass m is moving along the x-axis in a potential given by the potential energy function $U(x) = 0.5 m\omega^2 x^2$. Compute the product $\Psi(x, t)^* U(x) \Psi(x, t)$. Express your answer in terms of the time-independent wave function, $\psi(x)$.


Combining Equation 7.23 and Equation 7.28, Schrödinger’s time-dependent equation reduces to


(7.30) $-\dfrac{\hbar^2}{2m}\dfrac{d^2\psi(x)}{dx^2} + U(x)\psi(x) = E\psi(x),$


where E is the total energy of the particle (a real number). This equation is called Schrödinger’s time-independent equation. Notice that we use “big psi” (Ψ) for the time-dependent wave function and “little psi” (ψ) for the time-independent wave function. The wave-function solution to this equation must be multiplied by the time-modulation factor to obtain the time-dependent wave function.
In the next sections, we solve Schrödinger’s time-independent equation for three cases: a quantum particle in a box, a simple harmonic oscillator, and a quantum barrier. These cases provide important lessons that can be used to solve more complicated systems. The time-independent wave-function solutions ψ(x) must satisfy three conditions:
• ψ(x) must be a continuous function.
• The first derivative of ψ(x) with respect to space, dψ(x)/dx, must be continuous, unless U(x) = ∞.
• ψ(x) must not diverge (“blow up”) at x = ±∞.


The first condition avoids sudden jumps or gaps in the wave function. The second condition requires the wave function to be smooth at all points, except in special cases. (In a more advanced course on quantum mechanics, for example, potential spikes of infinite depth and height are used to model solids.) The third condition requires the wave function to be normalizable. This third condition follows from Born’s interpretation of quantum mechanics. It ensures that $|\psi(x)|^2$ is a finite number so we can use it to calculate probabilities.


7.6 Check Your Understanding Which of the following wave functions is a valid wave-function solution for Schrödinger’s equation?


7.4 | The Quantum Particle in a Box
Learning Objectives


By the end of this section, you will be able to:
• Describe how to set up a boundary-value problem for the stationary Schrӧdinger equation
• Explain why the energy of a quantum particle in a box is quantized
• Describe the physical meaning of stationary solutions to Schrödinger’s equation and the connection of these solutions with time-dependent quantum states
• Explain the physical meaning of Bohr’s correspondence principle


In this section, we apply Schrödinger’s equation to a particle bound to a one-dimensional box. This special case provides lessons for understanding quantum mechanics in more complex systems. The energy of the particle is quantized as a consequence of a standing wave condition inside the box.
Consider a particle of mass m that is allowed to move only along the x-direction and whose motion is confined to the region between hard and rigid walls located at x = 0 and at x = L (Figure 7.10). Between the walls, the particle moves freely. This physical situation is called the infinite square well, described by the potential energy function


(7.31) $U(x) = \begin{cases} 0, & 0 \le x \le L, \\ \infty, & \text{otherwise}. \end{cases}$


Combining this equation with Schrӧdinger’s time-independent wave equation gives


(7.32) $-\dfrac{\hbar^2}{2m}\dfrac{d^2\psi(x)}{dx^2} = E\psi(x), \quad \text{for } 0 \le x \le L,$


where E is the total energy of the particle. What types of solutions do we expect? The energy of the particle is a positive number, so if the value of the wave function is positive (right side of the equation), the curvature of the wave function is negative, or concave down (left side of the equation). Similarly, if the value of the wave function is negative (right side of the equation), the curvature of the wave function is positive, or concave up (left side of the equation). This condition is met by an oscillating wave function, such as a sine or cosine wave. Since these waves are confined to the box, we envision standing waves with fixed endpoints at x = 0 and x = L.


Figure 7.10 The potential energy function that confines the particle in a one-dimensional box.


Solutions ψ(x) to this equation have a probabilistic interpretation. In particular, the square $|\psi(x)|^2$ represents the probability density of finding the particle at a particular location x. This function must be integrated to determine the probability of finding the particle in some interval of space. We are therefore looking for a normalizable solution that satisfies the following normalization condition:


(7.33) $\int_0^L dx\, |\psi(x)|^2 = 1.$


The walls are rigid and impenetrable, which means that the particle is never found beyond the wall. Mathematically, this means that the solution must vanish at the walls:
(7.34) $\psi(0) = \psi(L) = 0.$


We expect oscillating solutions, so the most general solution to this equation is
(7.35) $\psi_k(x) = A_k \cos kx + B_k \sin kx,$

where k is the wave number, and $A_k$ and $B_k$ are constants. Applying the boundary condition expressed by Equation 7.34 gives
(7.36) $\psi_k(0) = A_k \cos(k \cdot 0) + B_k \sin(k \cdot 0) = A_k = 0.$
Because we have $A_k = 0$, the solution must be
(7.37) $\psi_k(x) = B_k \sin kx.$
If $B_k$ is zero, $\psi_k(x) = 0$ for all values of x and the normalization condition, Equation 7.33, cannot be satisfied. Assuming $B_k \neq 0$, Equation 7.34 for x = L then gives
(7.38) $0 = B_k \sin(kL) \Rightarrow \sin(kL) = 0 \Rightarrow kL = n\pi, \quad n = 1, 2, 3, \ldots$
We discard the n = 0 solution because ψ(x) for this quantum number would be zero everywhere, an un-normalizable and therefore unphysical solution. Substituting Equation 7.37 into Equation 7.32 gives


(7.39) $-\dfrac{\hbar^2}{2m}\dfrac{d^2}{dx^2}\left(B_k \sin(kx)\right) = E\left(B_k \sin(kx)\right).$


Computing these derivatives leads to
(7.40) $E = E_k = \dfrac{\hbar^2 k^2}{2m}.$






According to de Broglie, $p = \hbar k$, so this expression implies that the total energy is equal to the kinetic energy, consistent with our assumption that the “particle moves freely.” Combining the results of Equation 7.38 and Equation 7.40 gives
(7.41) $E_n = n^2 \dfrac{\pi^2\hbar^2}{2mL^2}, \quad n = 1, 2, 3, \ldots$


Strange! A particle bound to a one-dimensional box can only have certain discrete (quantized) values of energy. Further, the particle cannot have zero kinetic energy: it is impossible for a particle bound to a box to be “at rest.”
To evaluate the allowed wave functions that correspond to these energies, we must find the normalization constant $B_n$. We impose the normalization condition Equation 7.33 on the wave function
(7.42) $\psi_n(x) = B_n \sin(n\pi x/L)$:
$1 = \int_0^L dx\, |\psi_n(x)|^2 = \int_0^L dx\, B_n^2 \sin^2\dfrac{n\pi x}{L} = B_n^2 \int_0^L dx\, \sin^2\dfrac{n\pi x}{L} = B_n^2\, \dfrac{L}{2} \Rightarrow B_n = \sqrt{\dfrac{2}{L}}.$


Hence, the wave functions that correspond to the energy values given in Equation 7.41 are


(7.43) $\psi_n(x) = \sqrt{\dfrac{2}{L}} \sin\dfrac{n\pi x}{L}, \quad n = 1, 2, 3, \ldots$


For the lowest energy state, or ground state, we have
(7.44) $E_1 = \dfrac{\pi^2\hbar^2}{2mL^2}, \quad \psi_1(x) = \sqrt{\dfrac{2}{L}} \sin\left(\dfrac{\pi x}{L}\right).$


All other energy states can be expressed as
(7.45) $E_n = n^2 E_1, \quad \psi_n(x) = \sqrt{\dfrac{2}{L}} \sin\left(\dfrac{n\pi x}{L}\right).$


The index n is called the energy quantum number or principal quantum number. The state for n = 2 is the first excited
state, the state for n = 3 is the second excited state, and so on. The first three quantum states (for n = 1, 2, and 3) of a
particle in a box are shown in Figure 7.11.
The wave functions in Equation 7.45 are sometimes referred to as the “states of definite energy.” Particles in these states are said to occupy energy levels, which are represented by the horizontal lines in Figure 7.11. Energy levels are analogous to rungs of a ladder that the particle can “climb” as it gains or loses energy.
The wave functions in Equation 7.45 are also called stationary states and standing wave states. These functions are “stationary,” because their probability density functions, $|\Psi(x, t)|^2$, do not vary in time, and “standing waves” because their real and imaginary parts oscillate up and down like a standing wave, like a rope waving between two children on a playground. Stationary states are states of definite energy [Equation 7.45], but linear combinations of these states, such as $\psi(x) = a\psi_1 + b\psi_2$ (also solutions to Schrödinger’s equation), are states of mixed energy.






Figure 7.11 The first three quantum states of a quantum particle in a box for principal quantum numbers n = 1, 2, and 3: (a) standing wave solutions and (b) allowed energy states.


Energy quantization is a consequence of the boundary conditions. If the particle is not confined to a box but wanders freely, the allowed energies are continuous. However, for a particle confined to the box, only certain energies ($E_1, 4E_1, 9E_1, \ldots$) are allowed. The energy difference between adjacent energy levels is given by
(7.46) $\Delta E_{n+1,\,n} = E_{n+1} - E_n = (n+1)^2 E_1 - n^2 E_1 = (2n+1)E_1.$
Conservation of energy demands that if the energy of the system changes, the energy difference is carried in some other form of energy. For the special case of a charged particle confined to a small volume (for example, in an atom), energy changes are often carried away by photons. The frequencies of the emitted photons give us information about the energy differences (spacings) of the system and the volume of containment, that is, the size of the “box” [see Equation 7.44].
Example 7.8


A Simple Model of the Nucleus
Suppose a proton is confined to a box of width $L = 1.00 \times 10^{-14}$ m (a typical nuclear radius). What are the energies of the ground and the first excited states? If the proton makes a transition from the first excited state to the ground state, what are the energy and the frequency of the emitted photon?
Strategy
If we assume that the proton confined in the nucleus can be modeled as a quantum particle in a box, all we need to do is to use Equation 7.41 to find its energies $E_1$ and $E_2$. The mass of a proton is $m = 1.67 \times 10^{-27}$ kg. The emitted photon carries away the energy difference $\Delta E = E_2 - E_1$. We can use the relation $E_f = hf$ to find its frequency f.
Solution
The ground state:


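$E_1 = \dfrac{\pi^2\hbar^2}{2mL^2} = \dfrac{\pi^2 (1.05 \times 10^{-34}\ \text{J} \cdot \text{s})^2}{2(1.67 \times 10^{-27}\ \text{kg})(1.00 \times 10^{-14}\ \text{m})^2} = 3.28 \times 10^{-13}\ \text{J} = 2.05\ \text{MeV}.$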






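The first excited state: $E_2 = 2^2 E_1 = 4(2.05\ \text{MeV}) = 8.20\ \text{MeV}$.
The energy of the emitted photon is $E_f = \Delta E = E_2 - E_1 = 8.20\ \text{MeV} - 2.05\ \text{MeV} = 6.15\ \text{MeV}$.
The frequency of the emitted photon is
$f = \dfrac{E_f}{h} = \dfrac{6.15\ \text{MeV}}{4.14 \times 10^{-21}\ \text{MeV} \cdot \text{s}} = 1.49 \times 10^{21}\ \text{Hz}.$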


Significance
This is the typical frequency of a gamma ray emitted by a nucleus. The energy of this photon is a few million times greater than that of a visible light photon, which carries roughly 2 to 3 eV.
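The arithmetic of this example can be reproduced with a few lines of code. The sketch below is illustrative only: it uses the rounded constants quoted in the example, plus an assumed conversion factor of 1.602 × 10⁻¹³ J per MeV that does not appear in the text. Because of rounding, the computed energies agree with the quoted values only to about two significant figures.

```python
import math

HBAR = 1.05e-34         # J*s, rounded value used in the example
H_MEV = 4.14e-21        # Planck constant in MeV*s, value used in the example
J_PER_MEV = 1.602e-13   # joules per MeV (assumed conversion factor)

def box_energy(n, mass, width):
    """Energy of level n of a particle in a box (Equation 7.41), in joules."""
    return n**2 * math.pi**2 * HBAR**2 / (2 * mass * width**2)

m_proton = 1.67e-27     # kg
L = 1.00e-14            # m

e1 = box_energy(1, m_proton, L) / J_PER_MEV   # ground state, MeV
e2 = box_energy(2, m_proton, L) / J_PER_MEV   # first excited state, MeV
photon = e2 - e1                              # photon energy, MeV
freq = photon / H_MEV                         # photon frequency, Hz

print(f"E1 = {e1:.2f} MeV, E2 = {e2:.2f} MeV")
print(f"photon energy = {photon:.2f} MeV, frequency = {freq:.2e} Hz")
```

Changing `width` shows how strongly the level spacing depends on confinement: halving the box width quadruples every energy.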


The expectation value of the position for a particle in a box is given by
(7.47) $\langle x \rangle = \int_0^L dx\, \psi_n^*(x)\, x\, \psi_n(x) = \int_0^L dx\, x\, |\psi_n(x)|^2 = \int_0^L dx\, x\, \dfrac{2}{L} \sin^2\dfrac{n\pi x}{L} = \dfrac{L}{2}.$


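We can also find the expectation value of the momentum or average momentum of a large number of particles in a given state:
(7.48) $\langle p \rangle = \int_0^L dx\, \psi_n^*(x)\left[-i\hbar \dfrac{d}{dx}\psi_n(x)\right]$
$= -i\hbar \int_0^L dx\, \sqrt{\dfrac{2}{L}} \sin\dfrac{n\pi x}{L} \left[\dfrac{d}{dx}\sqrt{\dfrac{2}{L}} \sin\dfrac{n\pi x}{L}\right] = -i\dfrac{2\hbar}{L} \int_0^L dx\, \sin\dfrac{n\pi x}{L} \left[\dfrac{n\pi}{L} \cos\dfrac{n\pi x}{L}\right]$
$= -i\dfrac{2n\pi\hbar}{L^2} \int_0^L dx\, \dfrac{1}{2} \sin\dfrac{2n\pi x}{L} = -i\dfrac{n\pi\hbar}{L^2}\, \dfrac{L}{2n\pi} \int_0^{2\pi n} d\varphi\, \sin\varphi = -i\dfrac{\hbar}{2L} \cdot 0 = 0.$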


Thus, for a particle in a state of definite energy, the average position is in the middle of the box and the average momentum of the particle is zero, as it would also be for a classical particle. Note that while the minimum energy of a classical particle can be zero (the particle can be at rest in the middle of the box), the minimum energy of a quantum particle is nonzero and given by Equation 7.44. The average particle energy in the nth quantum state (its expectation value of energy) is
(7.49) $E_n = \langle E \rangle = n^2 \dfrac{\pi^2\hbar^2}{2mL^2}.$


The result is not surprising because the standing wave state is a state of definite energy. Any energy measurement of this system must return a value equal to one of these allowed energies.
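The normalization and the two expectation values derived above can also be checked by direct numerical integration. The following is a minimal sketch under stated assumptions: the box width L = 2 and quantum number n = 3 are arbitrary illustrative choices, the integrals are approximated with a simple midpoint rule, and the helper names are not from the text. For the momentum, only the real integral that multiplies −iℏ in Equation 7.48 is evaluated.

```python
import math

def psi(n, x, L):
    """Stationary state of the particle in a box (Equation 7.45)."""
    return math.sqrt(2 / L) * math.sin(n * math.pi * x / L)

def dpsi_dx(n, x, L):
    """Analytic derivative of psi with respect to x."""
    return math.sqrt(2 / L) * (n * math.pi / L) * math.cos(n * math.pi * x / L)

def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

L = 2.0   # box width, arbitrary units (illustrative choice)
n = 3     # quantum number (illustrative choice)

norm = integrate(lambda x: psi(n, x, L) ** 2, 0, L)
mean_x = integrate(lambda x: x * psi(n, x, L) ** 2, 0, L)
# <p> equals -i*hbar times this integral, so <p> vanishes when it does.
p_integral = integrate(lambda x: psi(n, x, L) * dpsi_dx(n, x, L), 0, L)

print("normalization:", norm)
print("<x>:", mean_x, "expected L/2 =", L / 2)
print("integral in <p>:", p_integral)
```

Repeating the calculation for other values of n gives the same mean position and zero mean momentum, as the derivation predicts for every stationary state.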
Our analysis of the quantum particle in a box would not be complete without discussing Bohr’s correspondence principle. This principle states that for large quantum numbers, the laws of quantum physics must give the same results as the laws of classical physics. To illustrate how this principle works for a quantum particle in a box, we plot the probability density distribution
(7.50) $|\psi_n(x)|^2 = \dfrac{2}{L} \sin^2(n\pi x/L)$


for finding the particle around location x between the walls when the particle is in quantum state $\psi_n$. Figure 7.12 shows these probability distributions for the ground state, for the first excited state, and for a highly excited state that corresponds to a large quantum number. We see from these plots that when a quantum particle is in the ground state, it is most likely to be found around the middle of the box, where the probability distribution has the largest value. This is not so when the particle is in the first excited state because now the probability distribution has the zero value in the middle of the box, so there is no chance of finding the particle there. When a quantum particle is in the first excited state, the probability distribution has two maxima, and the best chance of finding the particle is at positions close to the locations of these maxima. This quantum picture is unlike the classical picture.






Figure 7.12 The probability density distribution $|\psi_n(x)|^2$ for a quantum particle in a box for: (a) the ground state, n = 1; (b) the first excited state, n = 2; and (c) the nineteenth excited state, n = 20.


The probability density of finding a classical particle between x and x + Δx depends on how much time Δt the particle spends in this region. Assuming that its speed u is constant, this time is Δt = Δx/u, which is also constant for any location between the walls. Therefore, the probability density of finding the classical particle at x is uniform throughout the box, and there is no preferable location for finding a classical particle. This classical picture is matched in the limit of large quantum numbers. For example, when a quantum particle is in a highly excited state, shown in Figure 7.12, the probability density is characterized by rapid fluctuations and then the probability of finding the quantum particle in the interval Δx does not depend on where this interval is located between the walls.
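The approach to the classical limit can be made quantitative by integrating Equation 7.50 over a finite interval. The sketch below is an illustration, not part of the original text: it uses the antiderivative x/L − sin(2nπx/L)/(2nπ) of the probability density, and the interval from 0 to L/3 and the list of quantum numbers are arbitrary choices.

```python
import math

def prob_quantum(n, a, b, L):
    """Probability of finding the particle in [a, b] for state n (integral of Equation 7.50)."""
    def antiderivative(x):
        return x / L - math.sin(2 * n * math.pi * x / L) / (2 * n * math.pi)
    return antiderivative(b) - antiderivative(a)

def prob_classical(a, b, L):
    """Uniform classical probability of finding the particle in [a, b]."""
    return (b - a) / L

L = 1.0
a, b = 0.0, L / 3    # illustrative interval: the first third of the box

for n in (1, 2, 20, 2000):
    print(n, prob_quantum(n, a, b, L))
print("classical:", prob_classical(a, b, L))
```

For small n the quantum probability differs noticeably from the classical value of 1/3, while for large n the oscillating term is suppressed by the factor 1/(2nπ) and the two results agree closely.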
Example 7.9


A Classical Particle in a Box
A small 0.40-kg cart is moving back and forth along an air track between two bumpers located 2.0 m apart. We assume no friction; collisions with the bumpers are perfectly elastic so that between the bumpers, the cart maintains a constant speed of 0.50 m/s. Treating the cart as a quantum particle, estimate the value of the principal quantum number that corresponds to its classical energy.
Strategy
We find the kinetic energy K of the cart and its ground state energy $E_1$ as though it were a quantum particle. The energy of the cart is completely kinetic, so $K = n^2 E_1$ (Equation 7.45). Solving for n gives $n = (K/E_1)^{1/2}$.
Solution
The kinetic energy of the cart is


$K = \dfrac{1}{2}mu^2 = \dfrac{1}{2}(0.40\ \text{kg})(0.50\ \text{m/s})^2 = 0.050\ \text{J}.$


The ground state energy of the cart, treated as a quantum particle, is
$E_1 = \dfrac{\pi^2\hbar^2}{2mL^2} = \dfrac{\pi^2 (1.05 \times 10^{-34}\ \text{J} \cdot \text{s})^2}{2(0.40\ \text{kg})(2.0\ \text{m})^2} = 3.40 \times 10^{-68}\ \text{J}.$

Therefore, $n = (K/E_1)^{1/2} = (0.050/3.40 \times 10^{-68})^{1/2} = 1.2 \times 10^{33}$.
Significance
We see from this example that the energy of a classical system is characterized by a very large quantum number. Bohr’s correspondence principle concerns this kind of situation. We can apply the formalism of quantum mechanics to any kind of system, quantum or classical, and the results are correct in each case. In the limit of high quantum numbers, there is no advantage in using quantum formalism because we can obtain the same results with the less complicated formalism of classical mechanics. However, we cannot apply classical formalism to a quantum system in a low-number energy state.
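The estimate in this example is easy to repeat for other macroscopic objects. The following sketch evaluates the same three steps (kinetic energy, ground state energy from Equation 7.44, and n from K = n²E₁) with the rounded value of ℏ used in the text; the function name is an illustrative choice.

```python
import math

HBAR = 1.05e-34   # J*s, rounded value used in the text

def ground_state_energy(mass, width):
    """Ground state energy of a particle in a box (Equation 7.44), in joules."""
    return math.pi**2 * HBAR**2 / (2 * mass * width**2)

mass = 0.40     # kg
width = 2.0     # m, distance between the bumpers
speed = 0.50    # m/s

kinetic = 0.5 * mass * speed**2
e1 = ground_state_energy(mass, width)
n = math.sqrt(kinetic / e1)   # from K = n^2 * E1

print(f"K = {kinetic:.3f} J, E1 = {e1:.2e} J, n = {n:.1e}")
```

The ground state energy comes out as a few times 10⁻⁶⁸ J and the quantum number is of order 10³³, so adjacent levels are far too closely spaced to be resolved for an object of this size.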


7.7 Check Your Understanding (a) Consider an infinite square well with wall boundaries x = 0 and x = L. What is the probability of finding a quantum particle in its ground state somewhere between x = 0 and x = L/4? (b) Repeat question (a) for a classical particle.


Having found the stationary states $\psi_n(x)$ and the energies $E_n$ by solving the time-independent Schrödinger equation Equation 7.32, we use Equation 7.28 to write wave functions $\Psi_n(x, t)$ that are solutions of the time-dependent Schrödinger’s equation given by Equation 7.23. For a particle in a box this gives
(7.51) $\Psi_n(x, t) = e^{-i\omega_n t}\psi_n(x) = \sqrt{\dfrac{2}{L}}\, e^{-iE_n t/\hbar} \sin\dfrac{n\pi x}{L}, \quad n = 1, 2, 3, \ldots$
where the energies are given by Equation 7.41.
The quantum particle in a box model has practical applications in the relatively new field of optoelectronics, which deals with devices that convert electrical signals into optical signals. This model also deals with nanoscale physical phenomena, such as a nanoparticle trapped in a low electric potential bounded by high-potential barriers.
7.5 | The Quantum Harmonic Oscillator


Learning Objectives
By the end of this section, you will be able to:
• Describe the model of the quantum harmonic oscillator
• Identify differences between the classical and quantum models of the harmonic oscillator
• Explain physical situations where the classical and the quantum models coincide


Oscillations are found throughout nature, in such things as electromagnetic waves, vibrating molecules, and the gentle back-and-forth sway of a tree branch. In previous chapters, we used Newtonian mechanics to study macroscopic oscillations, such as a block on a spring and a simple pendulum. In this chapter, we begin to study oscillating systems using quantum mechanics. We begin with a review of the classic harmonic oscillator.
The Classic Harmonic Oscillator
A simple harmonic oscillator is a particle or system that undergoes harmonic motion about an equilibrium position, such as an object with mass vibrating on a spring. In this section, we consider oscillations in one dimension only. Suppose a mass moves back and forth along the x-direction about the equilibrium position, x = 0. In classical mechanics, the particle moves in response to a linear restoring force given by $F_x = -kx$, where x is the displacement of the particle from its equilibrium position. The motion takes place between two turning points, x = ±A, where A denotes the amplitude of the motion. The position of the object varies periodically in time with angular frequency $\omega = \sqrt{k/m}$, which depends on the mass m of the oscillator and on the force constant k of the net force, and can be written as
(7.52) $x(t) = A \cos(\omega t + \phi).$


The total energy E of an oscillator is the sum of its kinetic energy $K = mu^2/2$ and the elastic potential energy of the force $U(x) = kx^2/2$,
(7.53) $E = \dfrac{1}{2}mu^2 + \dfrac{1}{2}kx^2.$


At turning points x = ±A, the speed of the oscillator is zero; therefore, at these points, the energy of oscillation is solely in the form of potential energy $E = kA^2/2$. The plot of the potential energy U(x) of the oscillator versus its position x is a parabola (Figure 7.13). The potential-energy function is a quadratic function of x, measured with respect to the equilibrium position. On the same graph, we also plot the total energy E of the oscillator, as a horizontal line that intercepts the parabola at x = ±A. Then the kinetic energy K is represented as the vertical distance between the line of total energy and the potential energy parabola.


Figure 7.13 The potential energy well of a classical harmonic oscillator: The motion is confined between turning points at x = −A and at x = +A. The energy of oscillations is $E = kA^2/2$.


In this plot, the motion of a classical oscillator is confined to the region where its kinetic energy is nonnegative, which is what the energy relation Equation 7.53 says. Physically, it means that a classical oscillator can never be found beyond its turning points, and its energy depends only on how far the turning points are from its equilibrium position. The energy of a classical oscillator changes in a continuous way. The lowest energy that a classical oscillator may have is zero, which corresponds to a situation where an object is at rest at its equilibrium position. The zero-energy state of a classical oscillator simply means no oscillations and no motion at all (a classical particle sitting at the bottom of the potential well in Figure 7.13). When an object oscillates, no matter how big or small its energy may be, it spends the longest time near the turning points, because this is where it slows down and reverses its direction of motion. Therefore, the probability of finding a classical oscillator between the turning points is highest near the turning points and lowest at the equilibrium position. (Note that this is not a statement of preference of the object to go to lower energy. It is a statement about how quickly the object moves through various regions.)
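How strongly a classical oscillator favors the turning points can be quantified from Equation 7.52. For x(t) = A cos(ωt), the fraction of each period spent with |x| ≤ a is (2/π) arcsin(a/A); this closed form is a standard result that is not derived in the text. The sketch below compares it with a direct sampling of one period, using the illustrative choice a = A/2.

```python
import math

def fraction_inside(a, amplitude):
    """Fraction of a period a classical oscillator spends with |x| <= a."""
    return (2 / math.pi) * math.asin(a / amplitude)

def fraction_inside_sampled(a, amplitude, samples=100000):
    """Same fraction, estimated by sampling the phase uniformly over one period."""
    count = 0
    for i in range(samples):
        theta = 2 * math.pi * (i + 0.5) / samples
        if abs(amplitude * math.cos(theta)) <= a:
            count += 1
    return count / samples

A = 1.0
inner = fraction_inside(A / 2, A)
inner_sampled = fraction_inside_sampled(A / 2, A)

print("time fraction with |x| <= A/2:", inner, inner_sampled)
print("time fraction with |x| >  A/2:", 1 - inner)
```

The central half of the range accounts for only one-third of the period; the oscillator spends the remaining two-thirds in the outer regions near the turning points.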
The Quantum Harmonic Oscillator
One problem with this classical formulation is that it is not general. We cannot use it, for example, to describe vibrations of diatomic molecules, where quantum effects are important. A first step toward a quantum formulation is to use the classical expression $k = m\omega^2$ to limit mention of a “spring” constant between the atoms. In this way the potential energy function can be written in a more general form,
(7.54) $U(x) = \dfrac{1}{2}m\omega^2 x^2.$


Combining this expression with the time-independent Schrӧdinger equation gives


(7.55) $-\dfrac{\hbar^2}{2m}\dfrac{d^2\psi(x)}{dx^2} + \dfrac{1}{2}m\omega^2 x^2 \psi(x) = E\psi(x).$


To solve Equation 7.55 (that is, to find the allowed energies E and their corresponding wave functions ψ(x)), we require the wave functions to be symmetric or antisymmetric about x = 0 (the bottom of the potential well) and to be normalizable. These conditions ensure that the probability density $|\psi(x)|^2$ is finite when integrated over the entire range of x from −∞ to +∞. How to solve Equation 7.55 is the subject of a more advanced course in quantum mechanics; here, we simply cite the results. The allowed energies are

(7.56) $E_n = \left(n + \dfrac{1}{2}\right)\hbar\omega = \dfrac{2n + 1}{2}\hbar\omega, \quad n = 0, 1, 2, 3, \ldots$


The wave functions that correspond to these energies (the stationary states or states of definite energy) are


(7.57) $\psi_n(x) = N_n e^{-\beta^2 x^2/2} H_n(\beta x), \quad n = 0, 1, 2, 3, \ldots$


where $\beta = \sqrt{m\omega/\hbar}$, $N_n$ is the normalization constant, and $H_n(y)$ is a polynomial of degree n called a Hermite polynomial. The first four Hermite polynomials are
$H_0(y) = 1$
$H_1(y) = 2y$
$H_2(y) = 4y^2 - 2$
$H_3(y) = 8y^3 - 12y.$


A few sample wave functions are given in Figure 7.14. As the value of the principal quantum number increases, the solutions alternate between even functions and odd functions about x = 0.
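Higher Hermite polynomials need not be looked up. They follow from the standard recurrence $H_{k+1}(y) = 2yH_k(y) - 2kH_{k-1}(y)$, which is not stated in the text and is quoted here as a known property of these polynomials. The sketch below builds the coefficient lists (lowest power first) and reproduces the polynomials listed above; the function name is an illustrative choice.

```python
def hermite_coefficients(n):
    """Coefficients of H_n(y), lowest power first, from H_{k+1} = 2y*H_k - 2k*H_{k-1}."""
    prev, curr = [1], [0, 2]          # H_0 = 1 and H_1 = 2y
    if n == 0:
        return prev
    for k in range(1, n):
        # Multiplying H_k by 2y doubles each coefficient and shifts it up one power.
        shifted = [0] + [2 * c for c in curr]
        # Subtract 2k * H_{k-1}, padding the shorter list with zeros.
        nxt = [s - 2 * k * (prev[i] if i < len(prev) else 0)
               for i, s in enumerate(shifted)]
        prev, curr = curr, nxt
    return curr

for n in range(5):
    print(n, hermite_coefficients(n))
```

In each list, only even powers appear when n is even and only odd powers when n is odd, which is the alternation between even and odd wave functions noted above.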






Figure 7.14 The first five wave functions of the quantum harmonic oscillator. The classical limits of the oscillator’s motion are indicated by vertical lines, corresponding to the classical turning points at x = ±A of a classical particle with the same energy as the energy of a quantum oscillator in the state indicated in the figure.


Example 7.10
Classical Region of Harmonic Oscillations
Find the amplitude A of oscillations for a classical oscillator with energy equal to the energy of a quantum oscillator in the quantum state n.
Strategy
To determine the amplitude A, we set the classical energy $E = kA^2/2 = m\omega^2 A^2/2$ equal to $E_n$ given by Equation 7.56.
Solution
We obtain


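$E_n = m\omega^2 A_n^2/2 \Rightarrow A_n = \sqrt{\dfrac{2}{m\omega^2}E_n} = \sqrt{\dfrac{2}{m\omega^2}\, \dfrac{2n + 1}{2}\hbar\omega} = \sqrt{(2n + 1)\dfrac{\hbar}{m\omega}}.$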


Significance
As the quantum number n increases, the energy of the oscillator and therefore the amplitude of oscillation increase (for a fixed natural angular frequency). For large n, the amplitude is approximately proportional to the square root of the quantum number.


Several interesting features appear in this solution. Unlike a classical oscillator, the measured energies of a quantum oscillator can take only the values given by Equation 7.56. Moreover, unlike the case for a quantum particle in a box, the allowable energy levels are evenly spaced,
(7.58) $\Delta E = E_{n+1} - E_n = \dfrac{2(n + 1) + 1}{2}\hbar\omega - \dfrac{2n + 1}{2}\hbar\omega = \hbar\omega = hf.$

When a particle bound to such a system makes a transition from a higher-energy state to a lower-energy state, the smallest-energy quantum carried by the emitted photon is necessarily hf. Similarly, when the particle makes a transition from a lower-energy state to a higher-energy state, the smallest-energy quantum that can be absorbed by the particle is hf. A quantum oscillator can absorb or emit energy only in multiples of this smallest-energy quantum. This is consistent with Planck’s hypothesis for the energy exchanges between radiation and the cavity walls in the blackbody radiation problem.
Example 7.11


Vibrational Energies of the Hydrogen Chloride Molecule
The HCl diatomic molecule consists of one chlorine atom and one hydrogen atom. Because the chlorine atom is 35 times more massive than the hydrogen atom, the vibrations of the HCl molecule can be quite well approximated by assuming that the Cl atom is motionless and the H atom performs harmonic oscillations due to an elastic molecular force modeled by Hooke’s law. The infrared vibrational spectrum measured for hydrogen chloride has the lowest-frequency line centered at $f = 8.88 \times 10^{13}$ Hz. What is the spacing between the vibrational energies of this molecule? What is the force constant k of the atomic bond in the HCl molecule?
Strategy
The lowest-frequency line corresponds to the emission of lowest-frequency photons. These photons are emitted when the molecule makes a transition between two adjacent vibrational energy levels. Assuming that energy levels are equally spaced, we use Equation 7.58 to estimate the spacing. The molecule is well approximated by treating the Cl atom as being infinitely heavy and the H atom as the mass m that performs the oscillations. Treating this molecular system as a classical oscillator, the force constant is found from the classical relation $k = m\omega^2$.
Solution
The energy spacing is


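$\Delta E = hf = (4.14 \times 10^{-15}\ \text{eV} \cdot \text{s})(8.88 \times 10^{13}\ \text{Hz}) = 0.368\ \text{eV}.$
The force constant is

$k = m\omega^2 = m(2\pi f)^2 = (1.67 \times 10^{-27}\ \text{kg})(2\pi \times 8.88 \times 10^{13}\ \text{Hz})^2 = 520\ \text{N/m}.$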
Significance
The force between atoms in an HCl molecule is surprisingly strong. The typical energy released in energy transitions between vibrational levels is in the infrared range. As we will see later, transitions between vibrational energy levels of a diatomic molecule often accompany transitions between rotational energy levels.
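The two results of this example follow from one-line formulas, so they are convenient to wrap in small functions and reuse for other diatomic molecules. This is a sketch using the constants quoted in the example; the function names are illustrative, and the model assumption (a hydrogen atom oscillating against an infinitely heavy partner) is the one stated in the Strategy.

```python
import math

H_EV = 4.14e-15     # Planck constant in eV*s, value used in the example
M_H = 1.67e-27      # mass of the hydrogen atom in kg, value used in the example

def level_spacing_ev(frequency):
    """Spacing between adjacent vibrational levels, Delta E = h f (Equation 7.58), in eV."""
    return H_EV * frequency

def force_constant(mass, frequency):
    """Force constant from the classical relation k = m * omega^2, in N/m."""
    return mass * (2 * math.pi * frequency) ** 2

f_hcl = 8.88e13     # Hz, lowest-frequency line of HCl

spacing = level_spacing_ev(f_hcl)
k_hcl = force_constant(M_H, f_hcl)

print(f"level spacing = {spacing:.3f} eV")
print(f"force constant = {k_hcl:.0f} N/m")
```

Because k scales with the square of the frequency, a molecule with a lower vibrational frequency and the same oscillating mass has a noticeably softer bond.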


7.8 Check Your Understanding The vibrational frequency of the hydrogen iodide HI diatomic molecule is $6.69 \times 10^{13}$ Hz. (a) What is the force constant of the molecular bond between the hydrogen and the iodine atoms? (b) What is the energy of the emitted photon when this molecule makes a transition between adjacent vibrational energy levels?


The quantum oscillator differs from the classic oscillator in three ways:
First, the ground-state energy of a quantum oscillator is $E_0 = \hbar\omega/2$, not zero. In the classical view, the lowest energy is zero. The nonexistence of a zero-energy state is common to all quantum-mechanical systems because of omnipresent fluctuations that are a consequence of the Heisenberg uncertainty principle. If a quantum particle sat motionless at the bottom of the potential well, its momentum as well as its position would have to be simultaneously exact, which would violate the Heisenberg uncertainty principle. Therefore, the lowest-energy state must be characterized by uncertainties in momentum and in position, so the ground state of a quantum particle must lie above the bottom of the potential well.
Second, a particle in a quantum harmonic oscillator potential can be found with nonzero probability outside the interval −A ≤ x ≤ +A. In a classic formulation of the problem, the particle would not have enough energy to be in this region. The probability of finding a ground-state quantum particle in the classically forbidden region is about 16%.
Third, the probability density distribution $|\psi_n(x)|^2$ for a quantum oscillator in the ground state, $\psi_0(x)$, is largest at the middle of the well (x = 0). For the particle to be found with greatest probability at the center of the well, we expect that the particle spends the most time there as it oscillates. This is opposite to the behavior of a classical oscillator, in which the particle spends most of its time moving with relatively small speeds near the turning points.
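The figure of about 16% quoted above can be reproduced with the complementary error function. The sketch below rests on two assumptions not spelled out in the text: that the normalized ground-state probability density, written in the dimensionless variable u = x/A₀ with A₀ = √(ℏ/mω) the ground-state turning point from Example 7.10, is the Gaussian e^(−u²)/√π; and that the forbidden region is therefore |u| > 1. The probability is then erfc(1), which is cross-checked here by a midpoint-rule integration of the Gaussian tail.

```python
import math

def forbidden_probability():
    """Probability of |u| > 1 for the density exp(-u^2)/sqrt(pi), which equals erfc(1)."""
    return math.erfc(1.0)

def forbidden_probability_numeric(u_max=8.0, steps=70000):
    """Cross-check: integrate both Gaussian tails from |u| = 1 out to u_max."""
    h = (u_max - 1.0) / steps
    tail = h * sum(math.exp(-(1.0 + (i + 0.5) * h) ** 2) for i in range(steps))
    return 2 * tail / math.sqrt(math.pi)

p_exact = forbidden_probability()
p_numeric = forbidden_probability_numeric()

print(f"probability outside the turning points: {p_exact:.4f}")
print(f"numerical cross-check: {p_numeric:.4f}")
```

Both routes give a probability between 15% and 16%, consistent with the value quoted for the ground state.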






7.9 Check Your Understanding Find the expectation value of the position for a particle in the ground state of a harmonic oscillator using symmetry.


Quantum probability density distributions change in character for excited states, becoming more like the classical distribution when the quantum number gets higher. We observe this change already for the first excited state of a quantum oscillator because the distribution $|\psi_1(x)|^2$ peaks near the turning points and vanishes at the equilibrium position, as seen in Figure 7.14. In accordance with Bohr’s correspondence principle, in the limit of high quantum numbers, the quantum description of a harmonic oscillator converges to the classical description, which is illustrated in Figure 7.15. The classical probability density distribution corresponding to the quantum energy of the n = 12 state is a reasonably good approximation of the quantum probability distribution for a quantum oscillator in this excited state. This agreement becomes increasingly better for highly excited states.


Figure 7.15 The probability density distribution for finding the quantum harmonic oscillator in its n = 12 quantum state. The dashed curve shows the probability density distribution of a classical oscillator with the same energy.


7.6 | The Quantum Tunneling of Particles through
Potential Barriers


Learning Objectives
By the end of this section, you will be able to:
• Describe how a quantum particle may tunnel across a potential barrier
• Identify important physical parameters that affect the tunneling probability
• Identify the physical phenomena where quantum tunneling is observed
• Explain how quantum tunneling is utilized in modern technologies


Quantum tunneling is a phenomenon in which particles penetrate a potential energy barrier with a height greater than the total energy of the particles. The phenomenon is interesting and important because it violates the principles of classical mechanics. Quantum tunneling is important in models of the Sun and has a wide range of applications, such as the scanning tunneling microscope and the tunnel diode.
Tunneling and Potential Energy
To illustrate quantum tunneling, consider a ball rolling along a surface with a kinetic energy of 100 J. As the ball rolls, it encounters a hill. The potential energy of the ball placed atop the hill is 10 J. Therefore, the ball (with 100 J of kinetic energy) easily rolls over the hill and continues on. In classical mechanics, the probability that the ball passes over the hill is exactly 1: it makes it over every time. If, however, the height of the hill is increased, so that a ball placed atop the hill has a potential energy of 200 J, the ball proceeds only part of the way up the hill, stops, and returns in the direction it came. The total energy of the ball is converted entirely into potential energy before it can reach the top of the hill. We do not expect, even after repeated attempts, the 100-J ball ever to be found beyond the hill. Therefore, the probability that the ball passes over the hill is exactly 0, and the probability that it is turned back or “reflected” by the hill is exactly 1. The ball never makes it over the hill. The existence of the ball beyond the hill is an impossibility or “energetically forbidden.”
However, according to quantum mechanics, the ball has a wave function and this function is defined over all space. The wave function may be highly localized, but there is always a chance that as the ball encounters the hill, the ball will suddenly be found beyond it. Indeed, this probability is appreciable if the “wave packet” of the ball is wider than the barrier.


View this interactive simulation (https://openstaxcollege.org/l/21intquatanvid) of tunneling.


In the language of quantum mechanics, the hill is characterized by a potential barrier. A finite-height square barrier is described by the following potential-energy function:

$$U(x) = \begin{cases} 0, & \text{when } x < 0 \\ U_0, & \text{when } 0 \le x \le L \\ 0, & \text{when } x > L. \end{cases} \tag{7.59}$$


The potential barrier is illustrated in Figure 7.16. When the height $U_0$ of the barrier is infinite, the wave packet representing an incident quantum particle is unable to penetrate it, and the quantum particle bounces back from the barrier boundary, just like a classical particle. When the width L of the barrier is infinite and its height is finite, a part of the wave packet representing an incident quantum particle can filter through the barrier boundary and eventually dies out after traveling some distance inside the barrier.






Figure 7.16 A potential energy barrier of height $U_0$ creates three physical regions with three different wave behaviors. In region I, where $x < 0$, an incident wave packet (incident particle) moves in a potential-free zone and coexists with a reflected wave packet (reflected particle). In region II, a part of the incident wave that has not been reflected at $x = 0$ moves as a transmitted wave in a constant potential $U(x) = +U_0$ and tunnels through to region III at $x = L$. In region III, for $x > L$, a wave packet (transmitted particle) that has tunneled through the potential barrier moves as a free particle in a potential-free zone. The energy E of the incident particle is indicated by the horizontal line.


When both the width L and the height $U_0$ are finite, a part of the quantum wave packet incident on one side of the barrier can penetrate the barrier boundary and continue its motion inside the barrier, where it is gradually attenuated on its way to the other side. A part of the incident quantum wave packet eventually emerges on the other side of the barrier in the form of the transmitted wave packet that tunneled through the barrier. How much of the incident wave can tunnel through a barrier depends on the barrier width L and its height $U_0$, and on the energy E of the quantum particle incident on the barrier. This is the physics of tunneling.
Barrier penetration by quantum wave functions was first analyzed theoretically by Friedrich Hund in 1927, shortly after Schrödinger published the equation that bears his name. A year later, George Gamow used the formalism of quantum mechanics to explain the radioactive α-decay of atomic nuclei as a quantum-tunneling phenomenon. The invention of the tunnel diode in 1957 made it clear that quantum tunneling is important to the semiconductor industry. In modern nanotechnologies, individual atoms are manipulated using a knowledge of quantum tunneling.
Tunneling and the Wave Function
Suppose a uniform and time-independent beam of electrons or other quantum particles with energy E traveling along the x-axis (in the positive direction to the right) encounters a potential barrier described by Equation 7.59. The question is: What is the probability that an individual particle in the beam will tunnel through the potential barrier? The answer can be found by solving the boundary-value problem for the time-independent Schrödinger equation for a particle in the beam. The general form of this equation is given by Equation 7.60, which we reproduce here:


$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + U(x)\psi(x) = E\psi(x), \quad \text{where } -\infty < x < +\infty. \tag{7.60}$$


In Equation 7.60, the potential function U(x) is defined by Equation 7.59. We assume that the given energy E of the incoming particle is smaller than the height $U_0$ of the potential barrier, $E < U_0$, because this is the interesting physical case. Knowing the energy E of the incoming particle, our task is to solve Equation 7.60 for a function $\psi(x)$ that is continuous and has continuous first derivatives for all x. In other words, we are looking for a “smooth-looking” solution (because this is how wave functions look) that can be given a probabilistic interpretation so that $|\psi(x)|^2 = \psi^*(x)\psi(x)$ is the probability density.
We divide the real axis into three regions with the boundaries defined by the potential function in Equation 7.59 (illustrated in Figure 7.16) and transcribe Equation 7.60 for each region. Denoting by $\psi_{\mathrm{I}}(x)$ the solution in region I for $x < 0$, by $\psi_{\mathrm{II}}(x)$ the solution in region II for $0 \le x \le L$, and by $\psi_{\mathrm{III}}(x)$ the solution in region III for $x > L$, the stationary Schrödinger equation has the following forms in these three regions:


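$$-\frac{\hbar^2}{2m}\frac{d^2\psi_{\mathrm{I}}(x)}{dx^2} = E\psi_{\mathrm{I}}(x), \quad \text{in region I: } -\infty < x < 0, \tag{7.61}$$

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_{\mathrm{II}}(x)}{dx^2} + U_0\psi_{\mathrm{II}}(x) = E\psi_{\mathrm{II}}(x), \quad \text{in region II: } 0 \le x \le L, \tag{7.62}$$

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_{\mathrm{III}}(x)}{dx^2} = E\psi_{\mathrm{III}}(x), \quad \text{in region III: } L < x < +\infty. \tag{7.63}$$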


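The continuity condition at region boundaries requires that

$$\psi_{\mathrm{I}}(0) = \psi_{\mathrm{II}}(0), \quad \text{at the boundary between regions I and II,} \tag{7.64}$$

and

$$\psi_{\mathrm{II}}(L) = \psi_{\mathrm{III}}(L), \quad \text{at the boundary between regions II and III.} \tag{7.65}$$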


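The “smoothness” condition requires that the first derivative of the solution be continuous at region boundaries:

$$\left.\frac{d\psi_{\mathrm{I}}(x)}{dx}\right|_{x=0} = \left.\frac{d\psi_{\mathrm{II}}(x)}{dx}\right|_{x=0}, \quad \text{at the boundary between regions I and II;} \tag{7.66}$$

and

$$\left.\frac{d\psi_{\mathrm{II}}(x)}{dx}\right|_{x=L} = \left.\frac{d\psi_{\mathrm{III}}(x)}{dx}\right|_{x=L}, \quad \text{at the boundary between regions II and III.} \tag{7.67}$$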


In what follows, we find the functions $\psi_{\mathrm{I}}(x)$, $\psi_{\mathrm{II}}(x)$, and $\psi_{\mathrm{III}}(x)$.
We can easily verify (by substituting into the original equation and differentiating) that in regions I and III, the solutions must be in the following general forms:


$$\psi_{\mathrm{I}}(x) = Ae^{+ikx} + Be^{-ikx} \tag{7.68}$$

$$\psi_{\mathrm{III}}(x) = Fe^{+ikx} + Ge^{-ikx} \tag{7.69}$$

where $k = \sqrt{2mE}/\hbar$ is a wave number and the complex exponent denotes oscillations,

$$e^{\pm ikx} = \cos kx \pm i \sin kx. \tag{7.70}$$


The constants A, B, F, and G in Equation 7.68 and Equation 7.69 may be complex. These solutions are illustrated in Figure 7.16. In region I, there are two waves, one incident (moving to the right) and one reflected (moving to the left), so neither of the constants A and B in Equation 7.68 may vanish. In region III, there is only one wave (moving to the right), which is the transmitted wave, so the constant G must be zero in Equation 7.69, $G = 0$. We can write explicitly that the incident wave is $\psi_{\mathrm{in}}(x) = Ae^{+ikx}$, that the reflected wave is $\psi_{\mathrm{ref}}(x) = Be^{-ikx}$, and that the transmitted wave is $\psi_{\mathrm{tra}}(x) = Fe^{+ikx}$. The squared amplitude of the incident wave is

$$|\psi_{\mathrm{in}}(x)|^2 = \psi_{\mathrm{in}}^*(x)\,\psi_{\mathrm{in}}(x) = \left(Ae^{+ikx}\right)^* Ae^{+ikx} = A^* e^{-ikx} A e^{+ikx} = A^* A = |A|^2.$$


Similarly, the squared amplitude of the reflected wave is $|\psi_{\mathrm{ref}}(x)|^2 = |B|^2$ and the squared amplitude of the transmitted wave is $|\psi_{\mathrm{tra}}(x)|^2 = |F|^2$. We know from the theory of waves that the square of the wave amplitude is directly proportional to the wave intensity. If we want to know how much of the incident wave tunnels through the barrier, we need to compute the square of the amplitude of the transmitted wave. The transmission probability or tunneling probability is the ratio of the transmitted intensity $(|F|^2)$ to the incident intensity $(|A|^2)$, written as


$$T(L, E) = \frac{|\psi_{\mathrm{tra}}(x)|^2}{|\psi_{\mathrm{in}}(x)|^2} = \frac{|F|^2}{|A|^2} = \left|\frac{F}{A}\right|^2 \tag{7.71}$$

where L is the width of the barrier and E is the total energy of the particle. This is the probability that an individual particle in the incident beam will tunnel through the potential barrier. Intuitively, we understand that this probability must depend on the barrier height $U_0$.
In region II, the terms in Equation 7.62 can be rearranged to


$$\frac{d^2\psi_{\mathrm{II}}(x)}{dx^2} = \beta^2\psi_{\mathrm{II}}(x) \tag{7.72}$$

where $\beta^2$ is positive because $U_0 > E$ and the parameter $\beta$ is a real number,

$$\beta^2 = \frac{2m}{\hbar^2}(U_0 - E). \tag{7.73}$$


The general solution to Equation 7.72 is not oscillatory (unlike in the other regions) and is in the form of exponentials that describe a gradual attenuation of $\psi_{\mathrm{II}}(x)$,

$$\psi_{\mathrm{II}}(x) = Ce^{-\beta x} + De^{+\beta x}. \tag{7.74}$$


The two types of solutions in the three regions are illustrated in Figure 7.17.


Figure 7.17 Three types of solutions to the stationary Schrödinger equation for the quantum-tunneling problem: Oscillatory behavior in regions I and III where a quantum particle moves freely, and exponential-decay behavior in region II (the barrier region) where the particle moves in the potential $U_0$.


Now we use the boundary conditions to find equations for the unknown constants. Equation 7.68 and Equation 7.74 are substituted into Equation 7.64 to give

$$A + B = C + D. \tag{7.75}$$


Equation 7.74 and Equation 7.69 are substituted into Equation 7.65 to give

$$Ce^{-\beta L} + De^{+\beta L} = Fe^{+ikL}. \tag{7.76}$$


Similarly, we substitute Equation 7.68 and Equation 7.74 into Equation 7.66, differentiate, and obtain

$$ik(A - B) = \beta(D - C). \tag{7.77}$$

Similarly, the boundary condition Equation 7.67 reads explicitly

$$\beta\left(De^{+\beta L} - Ce^{-\beta L}\right) = ikFe^{+ikL}. \tag{7.78}$$


We now have four equations for five unknown constants. However, because the quantity we are after is the transmission coefficient, defined in Equation 7.71 by the fraction F/A, the number of equations is exactly right: when we divide each of the above equations by A, we end up having only four unknown fractions, B/A, C/A, D/A, and F/A, three of which can be eliminated to find F/A. The actual algebra that leads to the expression for F/A is pretty lengthy, but it can be done either by hand or with the help of computer software. The end result is
$$\frac{F}{A} = \frac{e^{-ikL}}{\cosh(\beta L) + i(\gamma/2)\sinh(\beta L)}. \tag{7.79}$$


In deriving Equation 7.79, to avoid clutter, we use the substitutions $\gamma \equiv \beta/k - k/\beta$,

$$\cosh y = \frac{e^{y} + e^{-y}}{2}, \quad \text{and} \quad \sinh y = \frac{e^{y} - e^{-y}}{2}.$$
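As a check on the algebra, the four matching conditions can also be solved numerically. The sketch below sets A = 1 and imposes continuity of $\psi$ and of $d\psi/dx$ at x = 0 and x = L for the forms $Ae^{ikx} + Be^{-ikx}$, $Ce^{-\beta x} + De^{\beta x}$, and $Fe^{ikx}$, then compares the resulting F/A with the closed form of Equation 7.79. The small Gaussian-elimination helper and all function names are illustrative, and the values of k, β, and L in the usage lines are arbitrary dimensionless test inputs.

```python
import cmath
import math


def solve_linear(matrix, rhs):
    """Solve a small complex linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(map(complex, row)) + [complex(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0j] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def amplitude_ratios(k: float, beta: float, L: float):
    """Return (B/A, C/A, D/A, F/A) from the four matching conditions with A = 1."""
    ekl = cmath.exp(1j * k * L)
    dec, gro = math.exp(-beta * L), math.exp(beta * L)
    matrix = [
        [1, -1, -1, 0],                              # psi continuous at x = 0
        [0, dec, gro, -ekl],                         # psi continuous at x = L
        [-1j * k, beta, -beta, 0],                   # dpsi/dx continuous at x = 0
        [0, -beta * dec, beta * gro, -1j * k * ekl], # dpsi/dx continuous at x = L
    ]
    rhs = [-1, 0, -1j * k, 0]
    return solve_linear(matrix, rhs)


def closed_form_f_over_a(k: float, beta: float, L: float) -> complex:
    """F/A from Equation 7.79."""
    gamma = beta / k - k / beta
    return cmath.exp(-1j * k * L) / (
        math.cosh(beta * L) + 1j * (gamma / 2) * math.sinh(beta * L)
    )


if __name__ == "__main__":
    b, c, d, f = amplitude_ratios(1.0, 1.5, 2.0)
    print("numeric F/A    :", f)
    print("closed-form F/A:", closed_form_f_over_a(1.0, 1.5, 2.0))
    print("|B/A|^2 + |F/A|^2 =", abs(b) ** 2 + abs(f) ** 2)
```

The last printed line checks probability conservation: the reflected and transmitted fractions should sum to one because the wave number is the same on both sides of the barrier.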


We substitute Equation 7.79 into Equation 7.71 and obtain the exact expression for the transmission coefficient for the barrier,

$$T(L, E) = \left(\frac{F}{A}\right)^{*}\frac{F}{A} = \frac{e^{+ikL}}{\cosh(\beta L) - i(\gamma/2)\sinh(\beta L)} \cdot \frac{e^{-ikL}}{\cosh(\beta L) + i(\gamma/2)\sinh(\beta L)}$$

or

$$T(L, E) = \frac{1}{\cosh^2(\beta L) + (\gamma/2)^2\sinh^2(\beta L)} \tag{7.80}$$


where

$$\left(\frac{\gamma}{2}\right)^2 = \frac{1}{4}\left(\frac{1 - E/U_0}{E/U_0} + \frac{E/U_0}{1 - E/U_0} - 2\right).$$


For a wide and high barrier that transmits poorly, Equation 7.80 can be approximated by

$$T(L, E) = 16\,\frac{E}{U_0}\left(1 - \frac{E}{U_0}\right)e^{-2\beta L}. \tag{7.81}$$


Whether it is the exact expression Equation 7.80 or the approximate expression Equation 7.81, we see that the tunneling effect very strongly depends on the width L of the potential barrier. In the laboratory, we can adjust both the potential height $U_0$ and the width L to design nano-devices with desirable transmission coefficients.
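To see where the wide-barrier approximation holds, the two expressions can be compared directly. The sketch below is written for an electron, with energies in eV and widths in nm; it assumes the standard values $mc^2 = 511$ keV and $\hbar c = 197.3$ eV·nm, and the function names and sample inputs are illustrative.

```python
import math

M_C2_EV = 511.0e3   # electron rest energy in eV (assumed standard value)
HBAR_C_EV_NM = 197.3  # hbar * c in eV nm (assumed standard value)
TWO_M_OVER_HBAR2 = 2.0 * M_C2_EV / HBAR_C_EV_NM ** 2  # in 1 / (eV nm^2)


def beta_per_nm(E: float, U0: float) -> float:
    """Decay constant beta from Equation 7.73, in 1/nm, for E < U0 in eV."""
    return math.sqrt(TWO_M_OVER_HBAR2 * (U0 - E))


def transmission_exact(E: float, U0: float, L: float) -> float:
    """Transmission coefficient from Equation 7.80."""
    b = beta_per_nm(E, U0)
    ratio = E / U0
    gamma_half_sq = 0.25 * ((1 - ratio) / ratio + ratio / (1 - ratio) - 2)
    return 1.0 / (math.cosh(b * L) ** 2 + gamma_half_sq * math.sinh(b * L) ** 2)


def transmission_approx(E: float, U0: float, L: float) -> float:
    """Wide-barrier approximation from Equation 7.81."""
    ratio = E / U0
    return 16 * ratio * (1 - ratio) * math.exp(-2 * beta_per_nm(E, U0) * L)


if __name__ == "__main__":
    for width_nm in (0.05, 0.2, 1.0):
        exact = transmission_exact(7.0, 10.0, width_nm)
        approx = transmission_approx(7.0, 10.0, width_nm)
        print(f"L = {width_nm:4.2f} nm: exact {exact:.3e}, approximate {approx:.3e}")
```

For very thin barriers the approximate expression can exceed 1, which is unphysical; this is a reminder that Equation 7.81 is meant only for barriers that transmit poorly.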






Example 7.12
Transmission Coefficient
Two copper nanowires are insulated by a copper oxide nano-layer that provides a 10.0-eV potential barrier. Estimate the tunneling probability between the nanowires by 7.00-eV electrons through a 5.00-nm thick oxide layer. What if the thickness of the layer were reduced to just 1.00 nm? What if the energy of electrons were increased to 9.00 eV?
Strategy
Treating the insulating oxide layer as a finite-height potential barrier, we use Equation 7.81. We identify $U_0 = 10.0$ eV, $E_1 = 7.00$ eV, $E_2 = 9.00$ eV, $L_1 = 5.00$ nm, and $L_2 = 1.00$ nm. We use Equation 7.73 to compute the exponent. Also, we need the rest mass of the electron, $m = 511\ \text{keV}/c^2$, and the reduced Planck constant, $\hbar = 0.1973\ \text{keV}\cdot\text{nm}/c$. It is typical for this type of estimate to deal with very small quantities that are often not suitable for handheld calculators. To make correct estimates of orders of magnitude, we make the conversion $e^y = 10^{y/\ln 10}$.
Solution
Constants:


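$$\frac{2m}{\hbar^2} = \frac{2(511\ \text{keV}/c^2)}{(0.1973\ \text{keV}\cdot\text{nm}/c)^2} = 26{,}254\ \frac{1}{\text{keV}\cdot(\text{nm})^2},$$

$$\beta = \sqrt{\frac{2m}{\hbar^2}(U_0 - E)} = \sqrt{26{,}254\,\frac{(10.0\ \text{eV} - E)}{\text{keV}\cdot(\text{nm})^2}} = \sqrt{26.254\,(10.0\ \text{eV} - E)/\text{eV}}\ \frac{1}{\text{nm}}.$$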


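For a lower-energy electron with $E_1 = 7.00$ eV:

$$\beta_1 = \sqrt{26.254\,(10.00\ \text{eV} - E_1)/\text{eV}}\ \frac{1}{\text{nm}} = \sqrt{26.254\,(10.00 - 7.00)}\ \frac{1}{\text{nm}} = \frac{8.875}{\text{nm}},$$

$$T(L, E_1) = 16\,\frac{E_1}{U_0}\left(1 - \frac{E_1}{U_0}\right)e^{-2\beta_1 L} = 16\,\frac{7}{10}\left(1 - \frac{7}{10}\right)e^{-17.75\,L/\text{nm}} = 3.36\,e^{-17.75\,L/\text{nm}}.$$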


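For a higher-energy electron with $E_2 = 9.00$ eV:

$$\beta_2 = \sqrt{26.254\,(10.00\ \text{eV} - E_2)/\text{eV}}\ \frac{1}{\text{nm}} = \sqrt{26.254\,(10.00 - 9.00)}\ \frac{1}{\text{nm}} = \frac{5.124}{\text{nm}},$$

$$T(L, E_2) = 16\,\frac{E_2}{U_0}\left(1 - \frac{E_2}{U_0}\right)e^{-2\beta_2 L} = 16\,\frac{9}{10}\left(1 - \frac{9}{10}\right)e^{-10.25\,L/\text{nm}} = 1.44\,e^{-10.25\,L/\text{nm}}.$$

For a broad barrier with $L_1 = 5.00$ nm:

$$T(L_1, E_1) = 3.36\,e^{-17.75\,L_1/\text{nm}} = 3.36\,e^{-17.75 \cdot 5.00\ \text{nm}/\text{nm}} = 3.36\,e^{-88.75} = 3.36\,(2.86 \times 10^{-39}) = 9.6 \times 10^{-37}\,\%,$$

$$T(L_1, E_2) = 1.44\,e^{-10.25\,L_1/\text{nm}} = 1.44\,e^{-10.25 \cdot 5.00\ \text{nm}/\text{nm}} = 1.44\,e^{-51.25} = 1.44\,(5.53 \times 10^{-23}) = 8.0 \times 10^{-21}\,\%.$$

For a narrower barrier with $L_2 = 1.00$ nm:

$$T(L_2, E_1) = 3.36\,e^{-17.75\,L_2/\text{nm}} = 3.36\,e^{-17.75 \cdot 1.00\ \text{nm}/\text{nm}} = 3.36\,e^{-17.75} = 3.36\,(1.96 \times 10^{-8}) = 6.6 \times 10^{-6}\,\%,$$

$$T(L_2, E_2) = 1.44\,e^{-10.25\,L_2/\text{nm}} = 1.44\,e^{-10.25 \cdot 1.00\ \text{nm}/\text{nm}} = 1.44\,e^{-10.25} = 1.44\,(3.54 \times 10^{-5}) = 5.1 \times 10^{-3}\,\%.$$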


Significance
We see from these estimates that the probability of tunneling is affected more by the width of the potential barrier than by the energy of an incident particle. In today’s technologies, we can manipulate individual atoms on metal surfaces to create potential barriers that are fractions of a nanometer, giving rise to measurable tunneling currents. One of many applications of this technology is the scanning tunneling microscope (STM), which we discuss later in this section.
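Because these probabilities span dozens of orders of magnitude, it can be convenient to evaluate Equation 7.81 in logarithmic form, in the spirit of the conversion $e^y = 10^{y/\ln 10}$ mentioned in the Strategy. The sketch below is one way to do that; it assumes the constants $mc^2 = 511$ keV and $\hbar c = 197.3$ eV·nm, and the helper names are illustrative.

```python
import math

M_C2_EV = 511.0e3     # electron rest energy in eV
HBAR_C_EV_NM = 197.3  # hbar * c in eV nm


def log10_transmission(E_eV: float, U0_eV: float, L_nm: float) -> float:
    """Base-10 logarithm of the approximate transmission (Equation 7.81)."""
    beta = math.sqrt(2.0 * M_C2_EV * (U0_eV - E_eV)) / HBAR_C_EV_NM  # in 1/nm
    prefactor = 16.0 * (E_eV / U0_eV) * (1.0 - E_eV / U0_eV)
    # log10(prefactor * exp(-2 beta L)) = log10(prefactor) - 2 beta L / ln(10)
    return math.log10(prefactor) - 2.0 * beta * L_nm / math.log(10.0)


def as_scientific(log10_value: float):
    """Split a base-10 logarithm into (mantissa, exponent) with 1 <= mantissa < 10."""
    exponent = math.floor(log10_value)
    mantissa = 10.0 ** (log10_value - exponent)
    return mantissa, exponent


if __name__ == "__main__":
    for energy in (7.0, 9.0):
        for width in (5.0, 1.0):
            mantissa, exponent = as_scientific(log10_transmission(energy, 10.0, width))
            print(f"E = {energy} eV, L = {width} nm: T = {mantissa:.1f}e{exponent}")
```

Working with the logarithm avoids underflow and makes the order of magnitude explicit; multiply the result by 100 to express it as a percentage.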






7.10 Check Your Understanding A proton with kinetic energy 1.00 eV is incident on a square potential barrier with height 10.00 eV. If the proton is to have the same transmission probability as an electron of the same energy, what must the width of the barrier be relative to the barrier width encountered by an electron?


Radioactive Decay
In 1928, Gamow identified quantum tunneling as the mechanism responsible for the radioactive decay of atomic nuclei. He observed that some isotopes of thorium, uranium, and bismuth disintegrate by emitting α-particles (which are doubly ionized helium atoms or, simply speaking, helium nuclei). In the process of emitting an α-particle, the original nucleus is transformed into a new nucleus that has two fewer neutrons and two fewer protons than the original nucleus. The α-particles emitted by one isotope have approximately the same kinetic energies. When we look at variations of these energies among isotopes of various elements, the lowest kinetic energy is about 4 MeV and the highest is about 9 MeV, so these energies are of the same order of magnitude. This is about where the similarities between various isotopes end.
When we inspect half-lives (a half-life is the time in which a radioactive sample loses half of its nuclei due to decay), different isotopes differ widely. For example, the half-life of polonium-214 is 160 µs and the half-life of uranium-238 is 4.5 billion years. Gamow explained this variation by considering a ‘spherical-box’ model of the nucleus, where α-particles can bounce back and forth between the walls as free particles. The confinement is provided by a strong nuclear potential at a spherical wall of the box. The thickness of this wall, however, is not infinite but finite, so in principle, a nuclear particle has a chance to escape this nuclear confinement. On the inside wall of the confining barrier is a high nuclear potential that keeps the α-particle in a small confinement. But when an α-particle gets out to the other side of this wall, it is subject to electrostatic Coulomb repulsion and moves away from the nucleus. This idea is illustrated in Figure 7.18. The width L of the potential barrier that separates an α-particle from the outside world depends on the particle’s kinetic energy E. This width is the distance between the point marked by the nuclear radius R and the point $R_0$ where an α-particle emerges on the other side of the barrier, $L = R_0 - R$. At the distance $R_0$, its kinetic energy must at least match the electrostatic energy of repulsion, $E = (4\pi\varepsilon_0)^{-1}Ze^2/R_0$ (where $+Ze$ is the charge of the nucleus). In this way we can estimate the width of the nuclear barrier,

$$L = \frac{e^2}{4\pi\varepsilon_0}\frac{Z}{E} - R.$$


We see from this estimate that the higher the energy of the α-particle, the narrower the width of the barrier that it is to tunnel through. We also know that the width of the potential barrier is the most important parameter in tunneling probability. Thus, highly energetic α-particles have a good chance to escape the nucleus, and, for such nuclei, the nuclear disintegration half-life is short. Notice that this process is highly nonlinear, meaning a small increase in the α-particle energy has a disproportionately large enhancing effect on the tunneling probability and, consequently, on shortening the half-life. This explains why the half-life of polonium that emits 8-MeV α-particles is only hundreds of microseconds and the half-life of uranium that emits 4-MeV α-particles is billions of years.
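The width estimate above is easy to evaluate once the Coulomb constant is expressed in nuclear units, $e^2/(4\pi\varepsilon_0) \approx 1.44$ MeV·fm. The sketch below applies the text's simplified formula as written; the nuclear radius of 8 fm is a hypothetical round value chosen only for illustration, and the function names are not from the text.

```python
COULOMB_MEV_FM = 1.44  # e^2 / (4 pi eps0) in MeV fm


def outer_turning_point_fm(Z: int, E_MeV: float) -> float:
    """Distance R0 at which the Coulomb energy Z e^2 / (4 pi eps0 R0) equals E."""
    return COULOMB_MEV_FM * Z / E_MeV


def barrier_width_fm(Z: int, E_MeV: float, R_fm: float) -> float:
    """Barrier width L = R0 - R from the simplified estimate in the text."""
    return outer_turning_point_fm(Z, E_MeV) - R_fm


if __name__ == "__main__":
    R_ASSUMED_FM = 8.0  # hypothetical nuclear radius, for illustration only
    for label, Z, energy in (("polonium, 8 MeV", 84, 8.0), ("uranium, 4 MeV", 92, 4.0)):
        width = barrier_width_fm(Z, energy, R_ASSUMED_FM)
        print(f"{label}: L = {width:.1f} fm")
```

Under these assumptions the 4-MeV particle faces a barrier several times wider than the 8-MeV particle does, and because the tunneling probability falls off exponentially with width, that difference is what drives the enormous spread in half-lives.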






Figure 7.18 The potential energy barrier for an α-particle bound in the nucleus: To escape from the nucleus, an α-particle with energy E must tunnel across the barrier from distance R to distance $R_0$ away from the center.


Field Emission
Field emission is a process of emitting electrons from conducting surfaces due to a strong external electric field that is applied in the direction normal to the surface (Figure 7.19). As we know from our study of electric fields in earlier chapters, an applied external electric field causes the electrons in a conductor to move to its surface and stay there as long as the external field is not excessively strong. In this situation, we have a constant electric potential throughout the inside of the conductor, including its surface. In the language of potential energy, we say that an electron inside the conductor has a constant potential energy $U(x) = -U_0$ (here, x means inside the conductor). In the situation represented in Figure 7.19, where the external electric field is uniform and has magnitude $E_g$, if an electron happens to be outside the conductor at a distance x away from its surface, its potential energy would have to be $U(x) = -eE_g x$ (here, x denotes distance to the surface). Taking the origin at the surface, so that $x = 0$ is the location of the surface, we can represent the potential energy of conduction electrons in a metal as the potential energy barrier shown in Figure 7.20. In the absence of the external field, the potential energy becomes a step barrier defined by $U(x \le 0) = -U_0$ and by $U(x > 0) = 0$.






Figure 7.19 A normal-direction external electric field at the surface of a conductor: In a strong field, the electrons on a conducting surface may get detached from it and accelerate against the external electric field away from the surface.


Figure 7.20 The potential energy barrier at the surface of a metallic conductor in the presence of an external uniform electric field $E_g$ normal to the surface: It becomes a step-function barrier when the external field is removed. The work function of the metal is indicated by $\phi$.


When an external electric field is strong, conduction electrons at the surface may get detached from it and accelerate along electric field lines in a direction antiparallel to the external field, away from the surface. In short, conduction electrons may escape from the surface. The field emission can be understood as the quantum tunneling of conduction electrons through the potential barrier at the conductor’s surface. The physical principle at work here is very similar to the mechanism of α-emission from a radioactive nucleus.






Suppose a conduction electron has a kinetic energy E (the average kinetic energy of an electron in a metal is the work function $\phi$ for the metal and can be measured, as discussed for the photoelectric effect in Photons and Matter Waves), and an external electric field can be locally approximated by a uniform electric field of strength $E_g$. The width L of the potential barrier that the electron must cross is the distance from the conductor’s surface to the point outside the surface where its kinetic energy matches the value of its potential energy in the external field. In Figure 7.20, this distance is measured along the dashed horizontal line $U(x) = E$ from $x = 0$ to the intercept with $U(x) = -eE_g x$, so the barrier width is

$$L = \frac{e^{-1}E}{E_g} = \frac{e^{-1}\phi}{E_g}.$$


We see that L is inversely proportional to the strength $E_g$ of an external field. When we increase the strength of the external field, the potential barrier outside the conductor becomes steeper and its width decreases for an electron with a given kinetic energy. In turn, the probability that an electron will tunnel across the barrier (conductor surface) becomes exponentially larger. The electrons that emerge on the other side of this barrier form a current (tunneling-electron current) that can be detected above the surface. The tunneling-electron current is proportional to the tunneling probability. The tunneling probability depends nonlinearly on the barrier width L, and L can be changed by adjusting $E_g$. Therefore, the tunneling-electron current can be tuned by adjusting the strength of an external electric field at the surface. When the strength of an external electric field is constant, the tunneling-electron current has different values at different elevations L above the surface.
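As a rough numerical illustration of the barrier-width formula, take a hypothetical work function of $\phi = 4.5$ eV and a hypothetical field strength of $E_g = 5.0$ V/nm (both values are assumptions chosen for illustration, not data from the text):

```latex
L = \frac{e^{-1}\phi}{E_g}
  = \frac{4.5\ \text{V}}{5.0\ \text{V/nm}}
  = 0.90\ \text{nm},
\qquad
\text{and for } E_g = 10\ \text{V/nm:}\quad
L = \frac{4.5\ \text{V}}{10\ \text{V/nm}} = 0.45\ \text{nm}.
```

Doubling the field halves the barrier width, in line with the inverse proportionality noted above.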
The quantum tunneling phenomenon at metallic surfaces, which we have just described, is the physical principle behind the operation of the scanning tunneling microscope (STM), invented in 1981 by Gerd Binnig and Heinrich Rohrer. The STM device consists of a scanning tip (a needle, usually made of tungsten, platinum-iridium, or gold); a piezoelectric device that controls the tip’s elevation in a typical range of 0.4 to 0.7 nm above the surface to be scanned; some device that controls the motion of the tip along the surface; and a computer to display images. While the sample is kept at a suitable voltage bias, the scanning tip moves along the surface (Figure 7.21), and the tunneling-electron current between the tip and the surface is registered at each position. The amount of the current depends on the probability of electron tunneling from the surface to the tip, which, in turn, depends on the elevation of the tip above the surface. Hence, at each tip position, the distance from the tip to the surface is measured by measuring how many electrons tunnel out from the surface to the tip. This method can give an unprecedented resolution of about 0.001 nm, which is about 1% of the average diameter of an atom. In this way, we can see individual atoms on the surface, as in the image of a carbon nanotube in Figure 7.22.
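The sensitivity of the tunneling current to tip height can be estimated from the exponential factor $e^{-2\beta L}$ of Equation 7.81 alone. The sketch below assumes an effective barrier height $U_0 - E$ of 4 eV, a hypothetical value of the order of a metal work function, together with the constants $mc^2 = 511$ keV and $\hbar c = 197.3$ eV·nm; it is an order-of-magnitude illustration, not a model of a real instrument.

```python
import math

M_C2_EV = 511.0e3     # electron rest energy in eV
HBAR_C_EV_NM = 197.3  # hbar * c in eV nm


def decay_constant_per_nm(barrier_eV: float) -> float:
    """beta = sqrt(2 m (U0 - E)) / hbar, in 1/nm, for an electron."""
    return math.sqrt(2.0 * M_C2_EV * barrier_eV) / HBAR_C_EV_NM


def current_ratio(barrier_eV: float, gap_increase_nm: float) -> float:
    """Factor by which exp(-2 beta L) changes when the gap grows by gap_increase_nm."""
    return math.exp(-2.0 * decay_constant_per_nm(barrier_eV) * gap_increase_nm)


if __name__ == "__main__":
    BARRIER_EV = 4.0  # hypothetical effective barrier height
    for step_nm in (0.1, 0.01, 0.001):
        ratio = current_ratio(BARRIER_EV, step_nm)
        print(f"gap + {step_nm} nm -> current multiplied by {ratio:.3f}")
```

Under these assumptions, raising the tip by a tenth of a nanometer cuts the current by roughly an order of magnitude, which is why small height changes are readily detected.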






Figure 7.21 In STM, a surface at a constant potential is being scanned by a narrow tip moving along the surface. When the STM tip moves close to surface atoms, electrons can tunnel from the surface to the tip. This tunneling-electron current is continually monitored while the tip is in motion. The amount of current at location (x,y) gives information about the elevation of the tip above the surface at this location. In this way, a detailed topographical map of the surface is created and displayed on a computer monitor.


Figure 7.22 An STM image of a carbon nanotube: Atomic-scale resolution allows us to see individual atoms on the surface. STM images are in gray scale, and coloring is added to bring out details to the human eye.
Resonant Quantum Tunneling
Quantum tunneling has numerous applications in semiconductor devices such as electronic circuit components or integrated circuits that are designed at nanoscales; hence, the term ‘nanotechnology.’ For example, a diode (an electric-circuit element that causes an electron current in one direction to be different from the current in the opposite direction, when the polarity of the bias voltage is reversed) can be realized by a tunneling junction between two different types of semiconducting materials. In such a tunnel diode, electrons tunnel through a single potential barrier at a contact between two different semiconductors. At the junction, tunneling-electron current changes nonlinearly with the applied potential difference across the junction and may rapidly decrease as the bias voltage is increased. This is unlike the Ohm’s law behavior that we are familiar with in household circuits. This kind of rapid behavior (caused by quantum tunneling) is desirable in high-speed electronic devices.
Another kind of electronic nano-device utilizes resonant tunneling of electrons through potential barriers that occur in quantum dots. A quantum dot is a small region of a semiconductor nanocrystal that is grown, for example, in a silicon or aluminum arsenide crystal. Figure 7.23(a) shows a quantum dot of gallium arsenide embedded in an aluminum arsenide wafer. The quantum-dot region acts as a potential well of a finite height (shown in Figure 7.23(b)) that has two finite-height potential barriers at dot boundaries. As for a quantum particle in a box (that is, an infinite potential well), lower-lying energies of a quantum particle trapped in a finite-height potential well are quantized. The difference between the box and the well potentials is that a quantum particle in a box has an infinite number of quantized energies and is trapped in the box indefinitely, whereas a quantum particle trapped in a potential well has a finite number of quantized energy levels and can tunnel through potential barriers at well boundaries to the outside of the well. Thus, a quantum dot of gallium arsenide sitting in aluminum arsenide is a potential well where low-lying energies of an electron are quantized, indicated as $E_{\text{dot}}$ in part (b) in the figure. When the energy $E_{\text{electron}}$ of an electron in the outside region of the dot does not match its energy $E_{\text{dot}}$ that it would have in the dot, the electron does not tunnel through the region of the dot and there is no current through such a circuit element, even if it were kept at an electric voltage difference (bias). However, when this voltage bias is changed in such a way that one of the barriers is lowered, so that $E_{\text{dot}}$ and $E_{\text{electron}}$ become aligned, as seen in part (c) of the figure, an electron current flows through the dot. When the voltage bias is now increased, this alignment is lost and the current stops flowing. When the voltage bias is increased further, the electron tunneling becomes improbable until the bias voltage reaches a value for which the outside electron energy matches the next electron energy level in the dot. The word ‘resonance’ in the device name means that the tunneling-electron current occurs only when a selected energy level is matched by tuning an applied voltage bias, such as in the operation mechanism of the resonant-tunneling diode just described. Resonant-tunneling diodes are used as super-fast nano-switches.


Figure 7.23 Resonant-tunneling diode: (a) A quantum dot of gallium arsenide embedded in aluminum arsenide. (b) Potential well consisting of two potential barriers of a quantum dot with no voltage bias. Electron energies $E_{\text{electron}}$ in aluminum arsenide are not aligned with their energy levels $E_{\text{dot}}$ in the quantum dot, so electrons do not tunnel through the dot. (c) Potential well of the dot with a voltage bias across the device. A suitably tuned voltage difference distorts the well so that electron-energy levels in the dot are aligned with their energies in aluminum arsenide, causing the electrons to tunnel through the dot.






CHAPTER 7 REVIEW
KEY TERMS

anti-symmetric function: odd function
Born interpretation: states that the square of a wave function is the probability density
complex function: function containing both real and imaginary parts
Copenhagen interpretation: states that when an observer is not looking or when a measurement is not being made, the particle has many values of measurable quantities, such as position
correspondence principle: in the limit of large energies, the predictions of quantum mechanics agree with the predictions of classical mechanics
energy levels: states of definite energy, often represented by horizontal lines in an energy “ladder” diagram
energy quantum number: index that labels the allowed energy states
energy-time uncertainty principle: energy-time relation for uncertainties in the simultaneous measurements of the energy of a quantum state and of its lifetime
even function: in one dimension, a function symmetric with respect to the origin of the coordinate system
expectation value: average value of the physical quantity assuming a large number of particles with the same wave function
field emission: electron emission from conductor surfaces when a strong external electric field is applied in the direction normal to the conductor’s surface
ground state energy: lowest energy state in the energy spectrum
Heisenberg’s uncertainty principle: places limits on what can be known from simultaneous measurements of position and momentum; states that if the uncertainty on position is small then the uncertainty on momentum is large, and vice versa
infinite square well: potential function that is zero in a fixed range and infinite beyond this range
momentum operator: operator that corresponds to the momentum of a particle
nanotechnology: technology that is based on manipulation of nanostructures such as molecules or individual atoms to produce nano-devices such as integrated circuits
normalization condition: requires that the probability density integrated over the entire physical space results in the number one
odd function: in one dimension, a function antisymmetric with respect to the origin of the coordinate system
position operator: operator that corresponds to the position of a particle
potential barrier: potential function that rises and falls with increasing values of position
principal quantum number: energy quantum number
probability density: square of the particle’s wave function
quantum dot: small region of a semiconductor nanocrystal embedded in another semiconductor nanocrystal, acting as a potential well for electrons
quantum tunneling: phenomenon where particles penetrate through a potential energy barrier with a height greater than the total energy of the particles
resonant tunneling: tunneling of electrons through a finite-height potential well that occurs only when electron energies match an energy level in the well; occurs in quantum dots
resonant-tunneling diode: quantum dot with an applied voltage bias across it
scanning tunneling microscope (STM): device that utilizes the quantum-tunneling phenomenon at metallic surfaces to obtain images of nanoscale structures






Schrödinger’s time-dependent equation: equation in space and time that allows us to determine wave functions of a quantum particle
Schrödinger’s time-independent equation: equation in space that allows us to determine wave functions of a quantum particle; this wave function must be multiplied by a time-modulation factor to obtain the time-dependent wave function
standing wave state: stationary state for which the real and imaginary parts of Ψ(x, t) oscillate up and down like a standing wave (often modeled with sine and cosine functions)
state reduction: hypothetical process in which an observed or detected particle “jumps into” a definite state, often described in terms of the collapse of the particle’s wave function
stationary state: state for which the probability density function, $|\Psi(x, t)|^2$, does not vary in time
time-modulation factor: factor $e^{-i\omega t}$ that multiplies the time-independent wave function when the potential energy of the particle is time independent
transmission probability: also called tunneling probability, the probability that a particle will tunnel through a potential barrier
tunnel diode: electron tunneling-junction between two different semiconductors
tunneling probability: also called transmission probability, the probability that a particle will tunnel through a potential barrier
wave function: function that represents the quantum state of a particle (quantum system)
wave function collapse: equivalent to state reduction
wave packet: superposition of many plane matter waves that can be used to represent a localized particle


KEY EQUATIONS
Normalization condition in one dimension: $P(x = -\infty, +\infty) = \int_{-\infty}^{\infty} |\Psi(x, t)|^2 \, dx = 1$
Probability of finding a particle in a narrow interval of position in one dimension $(x, x + dx)$: $P(x, x + dx) = \Psi^*(x, t)\,\Psi(x, t)\,dx$
Expectation value of position in one dimension: $\langle x \rangle = \int_{-\infty}^{\infty} \Psi^*(x, t)\, x\, \Psi(x, t)\, dx$
Heisenberg’s position-momentum uncertainty principle: $\Delta x\, \Delta p \geq \frac{\hbar}{2}$
Heisenberg’s energy-time uncertainty principle: $\Delta E\, \Delta t \geq \frac{\hbar}{2}$
Schrödinger’s time-dependent equation: $-\frac{\hbar^2}{2m} \frac{\partial^2 \Psi(x, t)}{\partial x^2} + U(x, t)\,\Psi(x, t) = i\hbar \frac{\partial \Psi(x, t)}{\partial t}$
General form of the wave function for a time-independent potential in one dimension: $\Psi(x, t) = \psi(x)\, e^{-i\omega t}$
Schrödinger’s time-independent equation: $-\frac{\hbar^2}{2m} \frac{d^2 \psi(x)}{dx^2} + U(x)\,\psi(x) = E\,\psi(x)$
Schrödinger’s equation (free particle): $-\frac{\hbar^2}{2m} \frac{\partial^2 \psi(x)}{\partial x^2} = E\,\psi(x)$
Allowed energies (particle in box of length L): $E_n = n^2 \frac{\pi^2 \hbar^2}{2mL^2}, \quad n = 1, 2, 3, \ldots$






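Stationary states (particle in a box of length L): $\psi_n(x) = \sqrt{\frac{2}{L}} \sin\frac{n\pi x}{L}, \quad n = 1, 2, 3, \ldots$
Potential-energy function of a harmonic oscillator: $U(x) = \frac{1}{2} m\omega^2 x^2$
Stationary Schrödinger equation: $-\frac{\hbar^2}{2m} \frac{d^2 \psi(x)}{dx^2} + \frac{1}{2} m\omega^2 x^2\, \psi(x) = E\,\psi(x)$
The energy spectrum: $E_n = \left(n + \frac{1}{2}\right)\hbar\omega, \quad n = 0, 1, 2, 3, \ldots$
The energy wave functions: $\psi_n(x) = N_n\, e^{-\beta^2 x^2/2}\, H_n(\beta x), \quad n = 0, 1, 2, 3, \ldots$
Potential barrier: $U(x) = \begin{cases} 0, & \text{when } x < 0 \\ U_0, & \text{when } 0 \leq x \leq L \\ 0, & \text{when } x > L \end{cases}$
Definition of the transmission coefficient: $T(L, E) = \frac{|\psi_{\text{tra}}(x)|^2}{|\psi_{\text{in}}(x)|^2}$
A parameter in the transmission coefficient: $\beta^2 = \frac{2m}{\hbar^2}(U_0 - E)$
Transmission coefficient, exact: $T(L, E) = \frac{1}{\cosh^2 \beta L + (\gamma/2)^2 \sinh^2 \beta L}$
Transmission coefficient, approximate: $T(L, E) = 16\, \frac{E}{U_0} \left(1 - \frac{E}{U_0}\right) e^{-2\beta L}$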


SUMMARY
7.1 Wave Functions


• In quantum mechanics, the state of a physical system is represented by a wave function.
• In Born’s interpretation, the square of the magnitude of the particle’s wave function represents the probability density of finding the particle around a specific location in space.
• Wave functions must be normalized before they are used to make predictions.
• The expectation value is the average value of a physical quantity; computing it requires a wave function and an integration.
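
As a rough numerical illustration of the points above (normalization and the expectation value), the following Python sketch normalizes a trial wave function and then evaluates the expectation value of position. The trial function sin(πx/L) on 0 ≤ x ≤ L, the box length L = 1 in arbitrary units, and the helper name `integrate` are assumptions made only for this example.

```python
import math

def integrate(f, a, b, n=10_000):
    """Midpoint-rule estimate of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

L = 1.0  # hypothetical box length, arbitrary units

def psi_unnormalized(x):
    """Trial wave function (real valued), not yet normalized."""
    return math.sin(math.pi * x / L)

# Normalization: choose A so that the integral of |A psi|^2 equals 1.
norm_sq = integrate(lambda x: psi_unnormalized(x) ** 2, 0.0, L)
A = 1.0 / math.sqrt(norm_sq)

def psi(x):
    """Normalized wave function."""
    return A * psi_unnormalized(x)

# Total probability and expectation value <x> = integral of psi* x psi dx.
total_probability = integrate(lambda x: psi(x) ** 2, 0.0, L)
mean_x = integrate(lambda x: psi(x) * x * psi(x), 0.0, L)

print(f"A = {A:.6f}, total probability = {total_probability:.6f}, <x> = {mean_x:.6f}")
```

For this symmetric trial function the normalization constant should come out close to the square root of 2/L, and the expectation value of position should sit at the middle of the interval.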


7.2 The Heisenberg Uncertainty Principle
• The Heisenberg uncertainty principle states that it is impossible to simultaneously measure the x-components of position and of momentum of a particle with an arbitrarily high precision. The product of experimental uncertainties is always larger than or equal to ℏ/2.
• The limitations of this principle have nothing to do with the quality of the experimental apparatus but originate in the wave-like nature of matter.
• The energy-time uncertainty principle expresses the experimental observation that a quantum state that exists only for a short time cannot have a definite energy.
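
The two uncertainty relations can be turned into quick order-of-magnitude estimates. The sketch below is only an illustration: the confinement length of 0.1 nm and the lifetime of 10 ns are hypothetical inputs, and the function names are chosen for this example.

```python
HBAR = 1.054571817e-34  # reduced Planck constant, J*s
EV = 1.602176634e-19    # joules per electron volt

def min_momentum_uncertainty(delta_x):
    """Smallest momentum uncertainty allowed by dx * dp >= hbar / 2 (SI units)."""
    return HBAR / (2.0 * delta_x)

def min_energy_uncertainty(delta_t):
    """Smallest energy uncertainty allowed by dE * dt >= hbar / 2 (SI units)."""
    return HBAR / (2.0 * delta_t)

delta_x = 1.0e-10  # hypothetical position uncertainty: 0.1 nm
delta_t = 1.0e-8   # hypothetical lifetime of a state: 10 ns

delta_p = min_momentum_uncertainty(delta_x)
delta_e_ev = min_energy_uncertainty(delta_t) / EV

print(f"minimum momentum uncertainty: {delta_p:.3e} kg*m/s")
print(f"minimum energy uncertainty:   {delta_e_ev:.3e} eV")
```

Halving either input doubles the corresponding lower bound, which is the trade-off the principle describes.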


7.3 The Schrödinger Equation
• The Schrödinger equation is the fundamental equation of wave quantum mechanics. It allows us to make predictions about wave functions.
• When a particle moves in a time-independent potential, a solution of the time-dependent Schrödinger equation is a product of a time-independent wave function and a time-modulation factor.
• The Schrödinger equation can be applied to many physical situations.
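
The product form of the solution can be checked numerically. The Python sketch below multiplies a time-independent wave function by the time-modulation factor and confirms that the probability density stays constant in time while the real part oscillates. The particle-in-a-box ground state is used as the spatial part; the angular frequency, the sample position, and the sample times are arbitrary values assumed for illustration.

```python
import cmath
import math

def psi_box(x, n=1, length=1.0):
    """Time-independent box state sqrt(2/L) sin(n pi x / L)."""
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)

def Psi(x, t, omega, n=1, length=1.0):
    """Full wave function: spatial part times the factor exp(-i omega t)."""
    return psi_box(x, n, length) * cmath.exp(-1j * omega * t)

omega = 2.0                   # hypothetical angular frequency, arbitrary units
x0 = 0.3                      # sample position inside the box
times = [0.0, 0.4, 1.1, 2.7]  # arbitrary sample times

densities = [abs(Psi(x0, t, omega)) ** 2 for t in times]
real_parts = [Psi(x0, t, omega).real for t in times]

print("probability density at each time:", densities)
print("real part at each time:          ", real_parts)
```

The density list holds the same value at every sample time, which is what makes the state stationary, even though the real part of the wave function keeps changing.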


7.4 The Quantum Particle in a Box
• Energy states of a quantum particle in a box are found by solving the time-independent Schrödinger equation.
• To solve the time-independent Schrödinger equation for a particle in a box and find the stationary states and allowed energies, we require that the wave function terminate at the box wall.
• Energy states of a particle in a box are quantized and indexed by the principal quantum number.
• The quantum picture differs significantly from the classical picture when a particle is in a low-energy state with a low quantum number.
• In the limit of high quantum numbers, when the quantum particle is in a highly excited state, the quantum description of a particle in a box coincides with the classical description, in the spirit of Bohr’s correspondence principle.
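
The allowed energies and the approach to the classical limit can be sketched in a few lines of Python. The example assumes an electron in a box of length 0.1 nm, a hypothetical value picked only to give atomic-scale numbers; the function names are likewise illustrative.

```python
import math

HBAR = 1.054571817e-34            # reduced Planck constant, J*s
ELECTRON_MASS = 9.1093837015e-31  # kg
EV = 1.602176634e-19              # joules per electron volt

def box_energy(n, length, mass=ELECTRON_MASS):
    """Allowed energy E_n = n^2 pi^2 hbar^2 / (2 m L^2), in joules."""
    if n < 1:
        raise ValueError("quantum number n must be 1, 2, 3, ...")
    return n ** 2 * math.pi ** 2 * HBAR ** 2 / (2.0 * mass * length ** 2)

def relative_spacing(n, length, mass=ELECTRON_MASS):
    """Fractional gap (E_{n+1} - E_n) / E_n, which equals (2n + 1) / n^2."""
    e_n = box_energy(n, length, mass)
    return (box_energy(n + 1, length, mass) - e_n) / e_n

BOX_LENGTH = 1.0e-10  # hypothetical box length: 0.1 nm

ground_state_ev = box_energy(1, BOX_LENGTH) / EV
first_excited_ev = box_energy(2, BOX_LENGTH) / EV

print(f"E_1 = {ground_state_ev:.1f} eV, E_2 = {first_excited_ev:.1f} eV")
for n in (1, 10, 1000):
    print(f"n = {n:5d}: relative spacing = {relative_spacing(n, BOX_LENGTH):.6f}")
```

With these inputs the ground state energy is roughly 37.6 eV. The fractional gap between neighboring levels shrinks as n grows, so for highly excited states the spectrum looks nearly continuous, consistent with the correspondence principle.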


7.5 The Quantum Harmonic Oscillator
• The quantum harmonic oscillator is a model built in analogy with the model of a classical harmonic oscillator. It models the behavior of many physical systems, such as molecular vibrations or wave packets in quantum optics.
• The allowed energies of a quantum oscillator are discrete and evenly spaced. The energy spacing is equal to Planck’s energy quantum.
• The ground state energy is larger than zero. This means that, unlike a classical oscillator, a quantum oscillator is never at rest, even at the bottom of a potential well, and undergoes quantum fluctuations.
• The stationary states (states of definite energy) have nonzero values also in regions beyond classical turning points. When in the ground state, a quantum oscillator is most likely to be found around the position of the minimum of the potential well, which is the least-likely position for a classical oscillator.
• For high quantum numbers, the motion of a quantum oscillator becomes more similar to the motion of a classical oscillator, in accordance with Bohr’s correspondence principle.
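
A short Python sketch can make two of these statements concrete: the even spacing of the levels and the nonzero probability of finding the oscillator beyond the classical turning points. Energies are expressed in units of ℏω, which is a convenience assumed for this example. The ground-state result uses the Gaussian form of the n = 0 wave function, for which the probability outside the turning points reduces to the complementary error function evaluated at 1.

```python
import math

def oscillator_energy(n, hbar_omega=1.0):
    """Energy E_n = (n + 1/2) hbar omega, in units set by hbar_omega."""
    if n < 0:
        raise ValueError("quantum number n must be 0, 1, 2, ...")
    return (n + 0.5) * hbar_omega

# First few levels and the gaps between neighbors (in units of hbar omega).
levels = [oscillator_energy(n) for n in range(5)]
spacings = [upper - lower for lower, upper in zip(levels, levels[1:])]

# Ground state: |psi_0|^2 is proportional to exp(-beta^2 x^2), and the classical
# turning points sit at beta * x = +1 and -1, so the probability of being outside
# them is (2 / sqrt(pi)) * integral from 1 to infinity of exp(-u^2) du = erfc(1).
p_forbidden = math.erfc(1.0)

print("levels:  ", levels)
print("spacings:", spacings)
print(f"ground-state probability beyond turning points: {p_forbidden:.4f}")
```

The lowest level is one half of ℏω rather than zero, every gap equals ℏω, and about 16 percent of the ground-state probability lies in the classically forbidden region.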


7.6 The Quantum Tunneling of Particles through Potential Barriers
• A quantum particle that is incident on a potential barrier of a finite width and height may cross the barrier and appear on its other side. This phenomenon is called ‘quantum tunneling.’ It does not have a classical analog.
• To find the probability of quantum tunneling, we assume the energy of an incident particle and solve the stationary Schrödinger equation to find wave functions inside and outside the barrier. The tunneling probability is a ratio of squared amplitudes of the wave past the barrier to the incident wave.
• The tunneling probability depends on the energy of the incident particle relative to the height of the barrier and on the width of the barrier. It is strongly affected by the width of the barrier in a nonlinear, exponential way so that a small change in the barrier width causes a disproportionately large change in the transmission probability.
• Quantum-tunneling phenomena govern radioactive nuclear decays. They are utilized in many modern technologies such as STM and nano-electronics. STM allows us to see individual atoms on metal surfaces. Electron-tunneling devices have revolutionized electronics and allow us to build fast electronic devices of miniature sizes.
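
The exponential sensitivity to barrier width can be seen in a small numerical sketch. The Python code below evaluates the approximate transmission coefficient from the Key Equations and, for comparison, a widely used closed form for a rectangular barrier, T = 1 / [1 + U0² sinh²(βL) / (4E(U0 − E))]. The electron energy of 1 eV, the barrier height of 2 eV, and the widths of 1 nm and 2 nm are hypothetical values chosen only for illustration, as are the function names.

```python
import math

HBAR = 1.054571817e-34            # reduced Planck constant, J*s
ELECTRON_MASS = 9.1093837015e-31  # kg
EV = 1.602176634e-19              # joules per electron volt

def beta(energy, barrier_height, mass=ELECTRON_MASS):
    """Decay parameter beta = sqrt(2 m (U0 - E)) / hbar, in 1/m."""
    if not 0.0 < energy < barrier_height:
        raise ValueError("this sketch assumes 0 < E < U0")
    return math.sqrt(2.0 * mass * (barrier_height - energy)) / HBAR

def transmission_approx(width, energy, barrier_height, mass=ELECTRON_MASS):
    """Approximate T = 16 (E/U0) (1 - E/U0) exp(-2 beta L), valid for beta L >> 1."""
    ratio = energy / barrier_height
    b = beta(energy, barrier_height, mass)
    return 16.0 * ratio * (1.0 - ratio) * math.exp(-2.0 * b * width)

def transmission_exact(width, energy, barrier_height, mass=ELECTRON_MASS):
    """Closed-form T for a rectangular barrier with E < U0."""
    s = math.sinh(beta(energy, barrier_height, mass) * width)
    denom = 4.0 * energy * (barrier_height - energy)
    return 1.0 / (1.0 + barrier_height ** 2 * s ** 2 / denom)

E = 1.0 * EV   # hypothetical incident energy
U0 = 2.0 * EV  # hypothetical barrier height

t_1nm = transmission_approx(1.0e-9, E, U0)
t_2nm = transmission_approx(2.0e-9, E, U0)
t_1nm_exact = transmission_exact(1.0e-9, E, U0)

print(f"T(1 nm) approx = {t_1nm:.3e}, closed form = {t_1nm_exact:.3e}")
print(f"T(2 nm) approx = {t_2nm:.3e}, ratio T(2 nm)/T(1 nm) = {t_2nm / t_1nm:.3e}")
```

With these inputs, doubling the width reduces the transmission by more than four orders of magnitude, and the approximate and closed-form values at 1 nm agree closely because βL is well above 1. Passing a different `mass` lets the same functions be reused for heavier particles.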


CONCEPTUAL QUESTIONS
7.1 Wave Functions
1. What is the physical unit of a wave function, Ψ(x, t)? What is the physical unit of the square of this wave function?
2. Can the magnitude of a wave function (Ψ*(x, t) Ψ(x, t)) be a negative number? Explain.
3. What kind of physical quantity does a wave function of an electron represent?
4. What is the physical meaning of a wave function of a particle?
5. What is the meaning of the expression “expectation value”? Explain.






7.2 The Heisenberg Uncertainty Principle
6. If the formalism of quantum mechanics is ‘more exact’ than that of classical mechanics, why don’t we use quantum mechanics to describe the motion of a leaping frog? Explain.
7. Can the de Broglie wavelength of a particle be known precisely? Can the position of a particle be known precisely?
8. Can we measure the energy of a free localized particle with complete precision?
9. Can we measure both the position and momentum of a particle with complete precision?


7.3 The Schrödinger Equation
10. What is the difference between a wave function ψ(x, y, z) and a wave function Ψ(x, y, z, t) for the same particle?
11. If a quantum particle is in a stationary state, does it mean that it does not move?
12. Explain the difference between the time-dependent and time-independent Schrödinger equations.
13. Suppose a wave function is discontinuous at some point. Can this function represent a quantum state of some physical particle? Why? Why not?


7.4 The Quantum Particle in a Box
14. Using the quantum particle in a box model, describe how the possible energies of the particle are related to the size of the box.
15. Is it possible that when we measure the energy of a quantum particle in a box, the measurement may return a smaller value than the ground state energy? What is the highest value of the energy that we can measure for this particle?
16. For a quantum particle in a box, the first excited state (Ψ₂) has zero value at the midpoint position in the box, so that the probability density of finding a particle at this point is exactly zero. Explain what is wrong with the following reasoning: “If the probability of finding a quantum particle at the midpoint is zero, the particle is never at this point, right? How does it come then that the particle can cross this point on its way from the left side to the right side of the box?”


7.5 The Quantum Harmonic Oscillator
17. Is it possible to measure energy of 0.75ℏω for a quantum harmonic oscillator? Why? Why not? Explain.
18. Explain the connection between Planck’s hypothesis of energy quanta and the energies of the quantum harmonic oscillator.
19. If a classical harmonic oscillator can be at rest, why can the quantum harmonic oscillator never be at rest? Does this violate Bohr’s correspondence principle?
20. Use an example of a quantum particle in a box or a quantum oscillator to explain the physical meaning of Bohr’s correspondence principle.
21. Can we simultaneously measure position and energy of a quantum oscillator? Why? Why not?


7.6 The Quantum Tunneling of Particles through Potential Barriers
22. When an electron and a proton of the same kinetic energy encounter a potential barrier of the same height and width, which one of them will tunnel through the barrier more easily? Why?
23. What decreases the tunneling probability most: doubling the barrier width or halving the kinetic energy of the incident particle?
24. Explain the difference between a box-potential and a potential of a quantum dot.
25. Can a quantum particle ‘escape’ from an infinite potential well like that in a box? Why? Why not?
26. A tunnel diode and a resonant-tunneling diode both utilize the same physics principle of quantum tunneling. In what important way are they different?






PROBLEMS
7.1 Wave