Existing prompts typically emphasize either global or local information, which fails to excavate or fully leverage the effective information of cross-modal data, resulting in the subpar performance of ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results