Bootstrap

Chromium's deployment tools: depot_tools and gclient

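depot_tools is a toolkit that bundles gclient, gcl, gn, ninja, and other tools. Of these, gclient is the tool that fetches the code; under the hood it drives svn and git. The files in the depot_tools folder that matter here are gclient, gclient.py, subcommand.py, and gclient_utils.py.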

The gclient file is a bash script:

#########gclient###########
#!/usr/bin/bash
base_dir=$(dirname "$0")
 
 if [[ "#grep#fetch#cleanup#diff#" != *"#$1#"* ]]; then
   "$base_dir"/update_depot_tools
 fi
 
 PYTHONDONTWRITEBYTECODE=1 exec python "$base_dir/gclient.py" "$@"


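First, the script takes its own directory and stores it in base_dir. It then checks whether the first command-line argument is one of grep, fetch, cleanup, or diff; if it is not, the script runs update_depot_tools from base_dir, which updates the git and svn tooling. Finally, it invokes the Python script gclient.py in the same directory (depot_tools) and forwards all of its arguments to it.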
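The guard in the wrapper is a substring test on "#<arg>#". A minimal Python rendering of that test may make it easier to read; the names SKIP_UPDATE and should_update are made up for this illustration and do not exist in depot_tools.

```python
# Illustrative only: a Python rendering of the bash guard in the gclient wrapper.
# SKIP_UPDATE and should_update are hypothetical names, not depot_tools code.
SKIP_UPDATE = "#grep#fetch#cleanup#diff#"


def should_update(first_arg):
    """Return True when the wrapper would run update_depot_tools first."""
    # The bash test `"#...#" != *"#$1#"*` checks whether "#<arg>#" occurs
    # anywhere inside the list string.
    return "#%s#" % first_arg not in SKIP_UPDATE


for cmd in ("sync", "diff", "fetch", ""):
    print(repr(cmd), should_update(cmd))
```

Wrapping the argument in "#" on both sides keeps a partial name such as "gre" from matching "grep".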

###############gclient.py#######
# At the end of the file

def Main(argv):
      .....
   dispatcher = subcommand.CommandDispatcher(__name__)
   try:
     return dispatcher.execute(OptionParser(), argv)
    ......
if '__main__' == __name__:
   sys.exit(Main(sys.argv[1:]))

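When gclient.py runs as a script, it calls Main with the arguments forwarded by the bash script. Main does two things: it creates a dispatcher object, passing the name of the current module, and then calls the dispatcher's execute method, whose first argument is an OptionParser object and whose second argument is the argument list given to Main.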

Next we step into the CommandDispatcher class in the subcommand module and look at its execute method.

################# subcommand.py ###########
class CommandDispatcher(object):
  def __init__(self, module):
    """module is the name of the main python module where to look for commands.

    The python builtin variable __name__ MUST be used for |module|. If the
    script is executed in the form 'python script.py', __name__ == '__main__'
    and sys.modules['script'] doesn't exist. On the other hand if it is unit
    tested, __main__ will be the unit test's module so it has to reference to
    itself with 'script'. __name__ always match the right value.
    """
    self.module = sys.modules[module]

  def enumerate_commands(self):
    """Returns a dict of command and their handling function.

    The commands must be in the '__main__' modules. To import a command from a
    submodule, use:
      from mysubcommand import CMDfoo

    Automatically adds 'help' if not already defined.

    A command can be effectively disabled by defining a global variable to None,
    e.g.:
      CMDhelp = None
    """
    cmds = dict(
        (fn[3:], getattr(self.module, fn))
        for fn in dir(self.module) if fn.startswith('CMD'))
    cmds.setdefault('help', CMDhelp)
    return cmds

  def find_nearest_command(self, name):
    """Retrieves the function to handle a command.

    It automatically tries to guess the intended command by handling typos or
    incomplete names.
    """
    # Implicitly replace foo-bar to foo_bar since foo-bar is not a valid python
    # symbol but it's faster to type.
    name = name.replace('-', '_')
    commands = self.enumerate_commands()
    if name in commands:
      return commands[name]

    # An exact match was not found. Try to be smart and look if there's
    # something similar.
    commands_with_prefix = [c for c in commands if c.startswith(name)]
    if len(commands_with_prefix) == 1:
      return commands[commands_with_prefix[0]]

    # A #closeenough approximation of levenshtein distance.
    def close_enough(a, b):
      return difflib.SequenceMatcher(a=a, b=b).ratio()

    hamming_commands = sorted(
        ((close_enough(c, name), c) for c in commands),
        reverse=True)
    if (hamming_commands[0][0] - hamming_commands[1][0]) < 0.3:
      # Too ambiguous.
      return

    if hamming_commands[0][0] < 0.8:
      # Not similar enough. Don't be a fool and run a random command.
      return

    return commands[hamming_commands[0][1]]

  def _gen_commands_list(self):
    """Generates the short list of supported commands."""
    commands = self.enumerate_commands()
    docs = sorted(
        (name, self._create_command_summary(name, handler))
        for name, handler in commands.iteritems())
    # Skip commands without a docstring.
    docs = [i for i in docs if i[1]]
    # Then calculate maximum length for alignment:
    length = max(len(c) for c in commands)

    # Look if color is supported.
    colors = _get_color_module()
    green = reset = ''
    if colors:
      green = colors.Fore.GREEN
      reset = colors.Fore.RESET
    return (
        'Commands are:\n' +
        ''.join(
            '  %s%-*s%s %s\n' % (green, length, name, reset, doc)
            for name, doc in docs))

  def _add_command_usage(self, parser, command):
    """Modifies an OptionParser object with the function's documentation."""
    name = command.__name__[3:]
    if name == 'help':
      name = '<command>'
      # Use the module's docstring as the description for the 'help' command if
      # available.
      parser.description = (self.module.__doc__ or '').rstrip()
      if parser.description:
        parser.description += '\n\n'
      parser.description += self._gen_commands_list()
      # Do not touch epilog.
    else:
      # Use the command's docstring if available. For commands, unlike module
      # docstring, realign.
      lines = (command.__doc__ or '').rstrip().splitlines()
      if lines[:1]:
        rest = textwrap.dedent('\n'.join(lines[1:]))
        parser.description = '\n'.join((lines[0], rest))
      else:
        parser.description = lines[0]
      if parser.description:
        parser.description += '\n'
      parser.epilog = getattr(command, 'epilog', None)
      if parser.epilog:
        parser.epilog = '\n' + parser.epilog.strip() + '\n'

    more = getattr(command, 'usage_more', '')
    parser.set_usage(
        'usage: %%prog %s [options]%s' % (name, '' if not more else ' ' + more))

  @staticmethod
  def _create_command_summary(name, command):
    """Creates a oneline summary from the command's docstring."""
    if name != command.__name__[3:]:
      # Skip aliases.
      return ''
    doc = command.__doc__ or ''
    line = doc.split('\n', 1)[0].rstrip('.')
    if not line:
      return line
    return (line[0].lower() + line[1:]).strip()

  def execute(self, parser, args):
    """Dispatches execution to the right command.

    Fallbacks to 'help' if not disabled.
    """
    # Unconditionally disable format_description() and format_epilog().
    # Technically, a formatter should be used but it's not worth (yet) the
    # trouble.
    parser.format_description = lambda _: parser.description or ''
    parser.format_epilog = lambda _: parser.epilog or ''

    if args:
      if args[0] in ('-h', '--help') and len(args) > 1:
        # Inverse the argument order so 'tool --help cmd' is rewritten to
        # 'tool cmd --help'.
        args = [args[1], args[0]] + args[2:]
      command = self.find_nearest_command(args[0])
      if command:
        if command.__name__ == 'CMDhelp' and len(args) > 1:
          # Inverse the arguments order so 'tool help cmd' is rewritten to
          # 'tool cmd --help'. Do it here since we want 'tool hel cmd' to work
          # too.
          args = [args[1], '--help'] + args[2:]
          command = self.find_nearest_command(args[0]) or command

        # "fix" the usage and the description now that we know the subcommand.
        self._add_command_usage(parser, command)
        return command(parser, args[1:])

    cmdhelp = self.enumerate_commands().get('help')
    if cmdhelp:
      # Not a known command. Default to help.
      self._add_command_usage(parser, cmdhelp)
      return cmdhelp(parser, args)

    # Nothing can be done.
    return 2


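In gclient.py, the line dispatcher = subcommand.CommandDispatcher(__name__) shows that the argument passed in identifies the gclient.py module itself. In CommandDispatcher.__init__, self.module is therefore the gclient module. Keeping this in mind helps with enumerate_commands: as its name suggests, it enumerates every supported command and returns a dict whose keys are command names and whose values are the matching handler functions, for example {"sync": CMDsync}. The other functions, _add_command_usage, _create_command_summary, and _gen_commands_list, are helpers with fairly obvious purposes. The more involved one is find_nearest_command: it looks the command up in the dict produced by enumerate_commands and, if there is no exact match, falls back first to a unique prefix match and then to a fuzzy match, returning the handler of the matching command.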
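The lookup order described above (exact name, unique prefix, then similarity) can be tried out on a toy module. The sketch below is a simplified Python 3 rewrite, not the depot_tools code: the module toy and its two handlers are hypothetical, and the automatic 'help' entry is left out.

```python
import difflib
import types


# Hypothetical handlers standing in for gclient's CMD* functions.
def CMDsync(parser, args):
    """Checkout/update all modules."""
    return 'sync %s' % args


def CMDstatus(parser, args):
    """Show modification status."""
    return 'status %s' % args


# A throwaway module object playing the role of sys.modules[__name__].
toy = types.ModuleType('toy')
toy.CMDsync = CMDsync
toy.CMDstatus = CMDstatus


def enumerate_commands(module):
    """Map 'name' -> handler for every attribute called CMD<name>."""
    return dict((fn[3:], getattr(module, fn))
                for fn in dir(module) if fn.startswith('CMD'))


def find_nearest_command(module, name):
    """Exact match first, then a unique prefix, then a similarity score."""
    commands = enumerate_commands(module)
    name = name.replace('-', '_')
    if name in commands:
        return commands[name]
    with_prefix = [c for c in commands if c.startswith(name)]
    if len(with_prefix) == 1:
        return commands[with_prefix[0]]
    scored = sorted(
        ((difflib.SequenceMatcher(a=c, b=name).ratio(), c) for c in commands),
        reverse=True)
    if len(scored) > 1 and scored[0][0] - scored[1][0] < 0.3:
        return None  # too ambiguous
    if scored[0][0] < 0.8:
        return None  # not similar enough
    return commands[scored[0][1]]


for typed in ('sync', 'sy', 'synch', 's'):
    handler = find_nearest_command(toy, typed)
    print(typed, '->', handler.__name__ if handler else None)
```

The two thresholds (0.3 and 0.8) are copied from the listing above; everything else is scaffolding for the demonstration.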

############# gclient.py ##################
class OptionParser(optparse.OptionParser):
  gclientfile_default = os.environ.get('GCLIENT_FILE', '.gclient')

  def __init__(self, **kwargs):
    optparse.OptionParser.__init__(
        self, version='%prog ' + __version__, **kwargs)

    # Some arm boards have issues with parallel sync.
    if platform.machine().startswith('arm'):
      jobs = 1
    else:
      jobs = max(8, gclient_utils.NumLocalCpus())
    # cmp: 2013/06/19
    # Temporary workaround to lower bot-load on SVN server.
    # Bypassed if a bot_update flag is detected.
    if (os.environ.get('CHROME_HEADLESS') == '1' and
        not os.path.exists('update.flag')):
      jobs = 1

    self.add_option(
        '-j', '--jobs', default=jobs, type='int',
        help='Specify how many SCM commands can run in parallel; defaults to '
             '%default on this machine')
    self.add_option(
        '-v', '--verbose', action='count', default=0,
        help='Produces additional output for diagnostics. Can be used up to '
             'three times for more logging info.')
    self.add_option(
        '--gclientfile', dest='config_filename',
        help='Specify an alternate %s file' % self.gclientfile_default)
    self.add_option(
        '--spec',
        help='create a gclient file containing the provided string. Due to '
            'Cygwin/Python brokenness, it can\'t contain any newlines.')
    self.add_option(
        '--no-nag-max', default=False, action='store_true',
        help='Ignored for backwards compatibility.')

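The OptionParser argument is a class defined in gclient.py that subclasses optparse.OptionParser from the standard library. Its main job is to parse arguments and return the tuple (options, args); the example above shows clearly how OptionParser is set up. The return value is simply the input arguments after they have been parsed into a structured form.

The execute method handles two cases: a help query and running a command. For a help query the user types something like gclient --help sync. By the time this reaches execute, the arguments are --help sync, so execute first swaps them to get sync --help, that is, args[0] is "sync" and args[1] is --help. It then calls self.find_nearest_command to find the exact command or the closest fuzzy match, and assigns the result (a handler function whose name starts with CMD) to the variable command.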
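To see the (options, args) tuple in isolation, here is a small sketch that uses only the standard optparse module. The two options mirror -j and -v from the class above, but the default of 8 and the sample command line are made up for the illustration.

```python
import optparse

# A stripped-down parser with two of the options shown above.
parser = optparse.OptionParser()
parser.add_option('-j', '--jobs', default=8, type='int')
parser.add_option('-v', '--verbose', action='count', default=0)

# Hypothetical command line: two flags followed by two positional arguments.
options, args = parser.parse_args(['-j', '4', '-vv', 'src', 'third_party'])

# options carries the parsed flag values, args the leftover positionals.
print(options.jobs, options.verbose, args)
```

Recognised flags end up as attributes on options, while anything that is not a flag stays in args in its original order.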


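Inside execute(), once the arguments have been examined and find_nearest_command() has returned a command, the code checks whether it is the help command. If it is not, execute calls the handler directly: return command(parser, args[1:]). In the example above this means calling CMDsync with parser and the arguments that follow sync.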
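The flow through execute() can be condensed into a few lines. This is a simplified sketch rather than the real method: COMMANDS is a hypothetical hand-written table (the real code builds it with enumerate_commands and uses fuzzy lookup), and the fallback to 'help' is reduced to returning 2.

```python
def CMDsync(parser, args):
    # Hypothetical handler: just report what it was called with.
    return ('sync', args)


# Hypothetical command table; gclient derives this from its CMD* functions.
COMMANDS = {'sync': CMDsync}


def execute(parser, args):
    """Swap a leading --help behind the command, then dispatch."""
    if args and args[0] in ('-h', '--help') and len(args) > 1:
        # 'tool --help cmd' becomes 'tool cmd --help'.
        args = [args[1], args[0]] + args[2:]
    command = COMMANDS.get(args[0]) if args else None
    if command:
        # The handler receives everything after the command name.
        return command(parser, args[1:])
    return 2  # nothing can be done


print(execute(None, ['--help', 'sync']))
print(execute(None, ['sync', '--force']))
```

Because the swap happens before dispatch, each handler only ever needs to understand --help as one of its own options.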

Let's look at how CMDsync handles its input arguments.

#############gclient.py#############
def CMDsync(parser, args):
  """Checkout/update all modules."""
  parser.add_option('-f', '--force', action='store_true',
                    help='force update even for unchanged modules')
  parser.add_option('-n', '--nohooks', action='store_true',
                    help='don\'t run hooks after the update is complete')
  parser.add_option('-p', '--noprehooks', action='store_true',
                    help='don\'t run pre-DEPS hooks', default=False)
  parser.add_option('-r', '--revision', action='append',
                    dest='revisions', metavar='REV', default=[],
                    help='Enforces revision/hash for the solutions with the '
                         'format src@rev. The src@ part is optional and can be '
                         'skipped. -r can be used multiple times when .gclient '
                         'has multiple solutions configured and will work even '
                         'if the src@ part is skipped. Note that specifying '
                         '--revision means your safesync_url gets ignored.')
  parser.add_option('--with_branch_heads', action='store_true',
                    help='Clone git "branch_heads" refspecs in addition to '
                         'the default refspecs. This adds about 1/2GB to a '
                         'full checkout. (git only)')
  parser.add_option('-t', '--transitive', action='store_true',
                    help='When a revision is specified (in the DEPS file or '
                          'with the command-line flag), transitively update '
                          'the dependencies to the date of the given revision. '
                          'Only supported for SVN repositories.')
  parser.add_option('-H', '--head', action='store_true',
                    help='skips any safesync_urls specified in '
                         'configured solutions and sync to head instead')
  parser.add_option('-D', '--delete_unversioned_trees', action='store_true',
                    help='Deletes from the working copy any dependencies that '
                         'have been removed since the last sync, as long as '
                         'there are no local modifications. When used with '
                         '--force, such dependencies are removed even if they '
                         'have local modifications. When used with --reset, '
                         'all untracked directories are removed from the '
                         'working copy, excluding those which are explicitly '
                         'ignored in the repository.')
  parser.add_option('-R', '--reset', action='store_true',
                    help='resets any local changes before updating (git only)')
  parser.add_option('-M', '--merge', action='store_true',
                    help='merge upstream changes instead of trying to '
                         'fast-forward or rebase')
  parser.add_option('--deps', dest='deps_os', metavar='OS_LIST',
                    help='override deps for the specified (comma-separated) '
                         'platform(s); \'all\' will process all deps_os '
                         'references')
  parser.add_option('-m', '--manually_grab_svn_rev', action='store_true',
                    help='Skip svn up whenever possible by requesting '
                         'actual HEAD revision from the repository')
  parser.add_option('--upstream', action='store_true',
                    help='Make repo state match upstream branch.')
  parser.add_option('--output-json',
                    help='Output a json document to this path containing '
                         'summary information about the sync.')
  (options, args) = parser.parse_args(args)
  client = GClient.LoadCurrentConfig(options)

  if not client:
    raise gclient_utils.Error('client not configured; see \'gclient config\'')

  if options.revisions and options.head:
    # TODO(maruel): Make it a parser.error if it doesn't break any builder.
    print('Warning: you cannot use both --head and --revision')

  if options.verbose:
    # Print out the .gclient file.  This is longer than if we just printed the
    # client dict, but more legible, and it might contain helpful comments.
    print(client.config_content)
  ret = client.RunOnDeps('update', args)
  if options.output_json:
    slns = {}
    for d in client.subtree(True):
      normed = d.name.replace('\\', '/').rstrip('/') + '/'
      slns[normed] = {
          'revision': d.got_revision,
          'scm': d.used_scm.name if d.used_scm else None,
      }
    with open(options.output_json, 'wb') as f:
      json.dump({'solutions': slns}, f)
  return ret


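CMDupdate = CMDsync

The call (options, args) = parser.parse_args(args) parses the arguments of the sync command, after which GClient.LoadCurrentConfig(options) is called. To keep things simple, assume we typed plain gclient sync: options then holds only default values and args is empty.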

############gclient.py#############
def LoadCurrentConfig(options):
    """Searches for and loads a .gclient file relative to the current working
    dir. Returns a GClient object."""
    if options.spec:
      client = GClient('.', options)
      client.SetConfig(options.spec)
    else:
      path = gclient_utils.FindGclientRoot(os.getcwd(), options.config_filename)
      if not path:
        return None
      client = GClient(path, options)
      client.SetConfig(gclient_utils.FileRead(
          os.path.join(path, options.config_filename)))

    if (options.revisions and
        len(client.dependencies) > 1 and
        any('@' not in r for r in options.revisions)):
      print >> sys.stderr, (
          'You must specify the full solution name like --revision %s@%s\n'
          'when you have multiple solutions setup in your .gclient file.\n'
          'Other solutions present are: %s.') % (
              client.dependencies[0].name,
              options.revisions[0],
              ', '.join(s.name for s in client.dependencies[1:]))
    return client

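This function looks for a .gclient file relative to the current working directory and returns a client object (or None if no file is found). Along the way it calls GClient(path, options):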
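The listing delegates the search to gclient_utils.FindGclientRoot, whose body is not shown here. The sketch below assumes that it walks upward from the starting directory until it finds the config file; find_config_root is a hypothetical name, and the real helper may do more than this.

```python
import os
import tempfile


def find_config_root(from_dir, filename='.gclient'):
    """Walk up from from_dir until a directory containing filename is found."""
    path = os.path.realpath(from_dir)
    while not os.path.exists(os.path.join(path, filename)):
        parent = os.path.dirname(path)
        if parent == path:
            return None  # reached the filesystem root without a match
        path = parent
    return path


# Demonstration in a throwaway directory tree: <root>/.gclient and
# <root>/src/build as the directory we start searching from.
root = os.path.realpath(tempfile.mkdtemp())
open(os.path.join(root, '.gclient'), 'w').close()
nested = os.path.join(root, 'src', 'build')
os.makedirs(nested)

print(find_config_root(nested) == root)
```

Under that assumption, running gclient sync from any subdirectory of a checkout would still locate the same .gclient file.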

############## gclient.py ##################
class GClient(Dependency):
  """Object that represent a gclient checkout. A tree of Dependency(), one per
  solution or DEPS entry."""

  DEPS_OS_CHOICES = {
    "win32": "win",
    "win": "win",
    "cygwin": "win",
    "darwin": "mac",
    "mac": "mac",
    "unix": "unix",
    "linux": "unix",
    "linux2": "unix",
    "linux3": "unix",
    "android": "android",
  }

  DEFAULT_CLIENT_FILE_TEXT = ("""\
solutions = [
  { "name"        : "%(solution_name)s",
    "url"         : "%(solution_url)s",
    "deps_file"   : "%(deps_file)s",
    "managed"     : %(managed)s,
    "custom_deps" : {
    },
    "safesync_url": "%(safesync_url)s",
  },
]
cache_dir = %(cache_dir)r
""")

  DEFAULT_SNAPSHOT_SOLUTION_TEXT = ("""\
  { "name"        : "%(solution_name)s",
    "url"         : "%(solution_url)s",
    "deps_file"   : "%(deps_file)s",
    "managed"     : %(managed)s,
    "custom_deps" : {
%(solution_deps)s    },
    "safesync_url": "%(safesync_url)s",
  },
""")

  DEFAULT_SNAPSHOT_FILE_TEXT = ("""\
# Snapshot generated with gclient revinfo --snapshot
solutions = [
%(solution_list)s]
""")

  def __init__(self, root_dir, options):
    # Do not change previous behavior. Only solution level and immediate DEPS
    # are processed.
    self._recursion_limit = 2
    Dependency.__init__(self, None, None, None, None, True, None, None, None,
                        'unused', True)
    self._options = options
    if options.deps_os:
      enforced_os = options.deps_os.split(',')
    else:
      enforced_os = [self.DEPS_OS_CHOICES.get(sys.platform, 'unix')]
    if 'all' in enforced_os:
      enforced_os = self.DEPS_OS_CHOICES.itervalues()
    self._enforced_os = tuple(set(enforced_os))
    self._root_dir = root_dir
    self.config_content = None

  def SetConfig(self, content):
    assert not self.dependencies
    config_dict = {}
    self.config_content = content
    try:
      exec(content, config_dict)
    except SyntaxError, e:
      gclient_utils.SyntaxErrorToError('.gclient', e)

    # Append any target OS that is not already being enforced to the tuple.
    target_os = config_dict.get('target_os', [])
    if config_dict.get('target_os_only', False):
      self._enforced_os = tuple(set(target_os))
    else:
      self._enforced_os = tuple(set(self._enforced_os).union(target_os))

    gclient_scm.GitWrapper.cache_dir = config_dict.get('cache_dir')

    if not target_os and config_dict.get('target_os_only', False):
      raise gclient_utils.Error('Can\'t use target_os_only if target_os is '
                                'not specified')

    deps_to_add = []
    for s in config_dict.get('solutions', []):
      try:
        deps_to_add.append(Dependency(
            self, s['name'], s['url'],
            s.get('safesync_url', None),
            s.get('managed', True),
            s.get('custom_deps', {}),
            s.get('custom_vars', {}),
            s.get('custom_hooks', []),
            s.get('deps_file', 'DEPS'),
            True))
      except KeyError:
        raise gclient_utils.Error('Invalid .gclient file. Solution is '
                                  'incomplete: %s' % s)
    self.add_dependencies_and_close(deps_to_add, config_dict.get('hooks', []))
    logging.info('SetConfig() done')

  def SaveConfig(self):
    gclient_utils.FileWrite(os.path.join(self.root_dir,
                                         self._options.config_filename),
                            self.config_content)

      

  def RunOnDeps(self, command, args, ignore_requirements=False, progress=True):
    """Runs a command on each dependency in a client and its dependencies.

    Args:
      command: The command to use (e.g., 'status' or 'diff')
      args: list of str - extra arguments to add to the command line.
    """
    if not self.dependencies:
      raise gclient_utils.Error('No solution specified')
    revision_overrides = {}
    # It's unnecessary to check for revision overrides for 'recurse'.
    # Save a few seconds by not calling _EnforceRevisions() in that case.
    if command not in ('diff', 'recurse', 'runhooks', 'status'):
      revision_overrides = self._EnforceRevisions()
    pm = None
    # Disable progress for non-tty stdout.
    if (sys.stdout.isatty() and not self._options.verbose and progress):
      if command in ('update', 'revert'):
        pm = Progress('Syncing projects', 1)
      elif command == 'recurse':
        pm = Progress(' '.join(args), 1)
    work_queue = gclient_utils.ExecutionQueue(
        self._options.jobs, pm, ignore_requirements=ignore_requirements)
    for s in self.dependencies:
      work_queue.enqueue(s)
    work_queue.flush(revision_overrides, command, args, options=self._options)

    # Once all the dependencies have been processed, it's now safe to run the
    # hooks.
    if not self._options.nohooks:
      self.RunHooksRecursively(self._options)

    if command == 'update':
      # Notify the user if there is an orphaned entry in their working copy.
      # Only delete the directory if there are no changes in it, and
      # delete_unversioned_trees is set to true.
      entries = [i.name for i in self.root.subtree(False) if i.url]
      full_entries = [os.path.join(self.root_dir, e.replace('/', os.path.sep))
                      for e in entries]

      for entry, prev_url in self._ReadEntries().iteritems():
        if not prev_url:
          # entry must have been overridden via .gclient custom_deps
          continue
        # Fix path separator on Windows.
        entry_fixed = entry.replace('/', os.path.sep)
        e_dir = os.path.join(self.root_dir, entry_fixed)

        def _IsParentOfAny(parent, path_list):
          parent_plus_slash = parent + '/'
          return any(
              path[:len(parent_plus_slash)] == parent_plus_slash
              for path in path_list)

        # Use entry and not entry_fixed there.
        if (entry not in entries and
            (not any(path.startswith(entry + '/') for path in entries)) and
            os.path.exists(e_dir)):
          scm = gclient_scm.CreateSCM(prev_url, self.root_dir, entry_fixed)

          # Check to see if this directory is now part of a higher-up checkout.
          if scm.GetCheckoutRoot() in full_entries:
            logging.info('%s is part of a higher level checkout, not '
                         'removing.', scm.GetCheckoutRoot())
            continue

          file_list = []
          scm.status(self._options, [], file_list)
          modified_files = file_list != []
          if (not self._options.delete_unversioned_trees or
              (modified_files and not self._options.force)):
            # There are modified files in this entry. Keep warning until
            # removed.
            print(('\nWARNING: \'%s\' is no longer part of this client.  '
                   'It is recommended that you manually remove it.\n') %
                      entry_fixed)
          else:
            # Delete the entry
            print('\n________ deleting \'%s\' in \'%s\'' % (
                entry_fixed, self.root_dir))
            gclient_utils.rmtree(e_dir)
      # record the current list of entries for next time
      self._SaveEntries()
    return 0 

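client.SetConfig() reads the solutions from the config file and builds deps_to_add, a list of Dependency objects. It then calls add_dependencies_and_close with deps_to_add as an argument. That function validates the deps and, if they are valid, adds them to the dependency tree.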
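SetConfig treats the .gclient file as Python source and executes it into a dict. The sketch below shows that idea in Python 3 under simplifying assumptions: load_solutions is a hypothetical helper, the URL is a placeholder, and plain dicts stand in for Dependency objects.

```python
# Placeholder config text in the same shape as a .gclient file.
CONFIG_TEXT = '''
solutions = [
  { "name": "src",
    "url": "https://example.invalid/src.git",
    "custom_deps": {},
  },
]
'''


def load_solutions(content):
    """Execute config text and return its solutions with defaults filled in."""
    config = {}
    # The file is ordinary Python, so exec() fills config with its globals.
    exec(content, config)
    result = []
    for s in config.get('solutions', []):
        result.append({
            'name': s['name'],  # required: a KeyError means incomplete
            'url': s['url'],    # required
            'managed': s.get('managed', True),
            'deps_file': s.get('deps_file', 'DEPS'),
        })
    return result


print(load_solutions(CONFIG_TEXT))
```

Since the file is executed, a .gclient file should be treated like code: only load one you trust.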

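CMDsync then executes client.RunOnDeps(). This adds every entry in dependencies to a gclient_utils.ExecutionQueue and runs the queue. Once all the tasks have finished, it runs the hooks (unless --nohooks was given).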
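The shape of RunOnDeps (process every dependency with a bounded number of parallel jobs, then run hooks) can be imitated with the standard library. This is only a stand-in: it uses concurrent.futures instead of gclient_utils.ExecutionQueue, run_on_deps is a hypothetical name, and it ignores the ordering requirements and progress reporting that the real queue takes care of.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_deps(dependencies, command, jobs=2, hooks=()):
    """Run command on each dependency with up to jobs workers, then hooks."""
    def work(dep):
        # Stand-in for the real per-dependency SCM work.
        return '%s %s' % (command, dep)

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # map() returns results in input order even if work finishes out of order.
        results = list(pool.map(work, dependencies))

    # Hooks run only after every dependency has been processed.
    hook_results = [hook() for hook in hooks]
    return results, hook_results


print(run_on_deps(['src', 'src/third_party'], 'update',
                  hooks=[lambda: 'gyp_chromium']))
```

The jobs parameter plays the role of the -j option defined in OptionParser.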

######## src/DEPS #############

hooks = [
  {
    # A change to a .gyp, .gypi, or to GYP itself should run the generator.
    "pattern": ".",
    "action": ["python", "src/build/gyp_chromium"],
  },
]

The above is the hooks section of the DEPS file under src. In other words, once the sync command has finished updating files, src/build/gyp_chromium is run.
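How a hook's "pattern" selects it can be sketched as follows, assuming the pattern is a regular expression searched against the paths of updated files; that is an assumption of this illustration, since the matching code is not part of the listings above. The second hook and the function hooks_to_run are hypothetical.

```python
import re

hooks = [
    # The hook from src/DEPS: "." matches any non-empty path.
    {"pattern": ".", "action": ["python", "src/build/gyp_chromium"]},
    # Hypothetical second hook, limited to .gyp and .gypi files.
    {"pattern": r"\.gypi?$", "action": ["echo", "gyp changed"]},
]


def hooks_to_run(hooks, changed_files):
    """Return the actions of hooks whose pattern matches any changed file."""
    selected = []
    for hook in hooks:
        pattern = re.compile(hook["pattern"])
        if any(pattern.search(f) for f in changed_files):
            selected.append(hook["action"])
    return selected


print(hooks_to_run(hooks, ["src/base/foo.cc"]))
print(hooks_to_run(hooks, ["src/build/all.gyp"]))
```

With the pattern ".", the gyp_chromium hook is selected whenever at least one file has been updated.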




